# OpenSearch Community Meeting - Late Sept
Agenda:
- Accelerating vector search in OpenSearch (GSI Technology)
Feel free to comment on the agenda before the meeting if you want to add an item or have a question. During the meeting the agenda will be unlocked for collaborative editing / note taking. After the meeting the agenda will be set to read-only mode.
[Early Sept Agenda](https://hackmd.io/RtOxsG3cRQ-pi4lBRRlV-A)
Chat Log:
10:00:38 From Hugo Albarracín to Everyone:
Hi everyone
10:00:41 From Hugo Albarracín to Everyone:
ok
10:01:21 From Kyle Davis to Everyone:
👋 Hi everyone!
10:02:12 From Hugo Albarracín to Everyone:
Can I record the presentation?
10:18:46 From Shenoy Pratik to Everyone:
Hi Rudy, great presentation! Does the APU have a different floating point precision than the CPU (like using integer vs. float16/32)? If so, does it take a hit on accuracy?
10:20:53 From Brian Grabau to Everyone:
I joined late. What are the EC2 instance types for these new CPUs?
10:21:12 From Rudy Kirzhner to Everyone:
Hi Shenoy, the vector search on the APU takes a binary vector, derived from fp32. We use a high-accuracy vectorization neural hash, and it can be tweaked for the performance/accuracy tradeoff
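Editor's sketch of the general idea of searching over binary codes derived from fp32 vectors. It is not GSI's neural hash: it substitutes random-hyperplane sign hashing as a simple stand-in, and every name in it (`make_hyperplanes`, `binarize`, `hamming`, `search`) is hypothetical. The `bits` parameter is where a performance/accuracy tradeoff would show up in a scheme like this: more bits generally preserve more of the original geometry at the cost of larger codes.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes in `dim` dimensions (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def binarize(vec, planes):
    """Map a float vector to an integer whose bits are the signs of its projections."""
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions where two codes differ."""
    return bin(a ^ b).count("1")


def search(query, codes, planes, k=1):
    """Return the indices of the k stored codes closest to the query in Hamming distance."""
    q = binarize(query, planes)
    return sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))[:k]


# Assumed toy data: three 3-dimensional vectors hashed to 16-bit codes.
planes = make_hyperplanes(dim=3, bits=16)
corpus = [[1.0, 2.0, 3.0], [-1.0, -2.0, -3.0], [3.0, -1.0, 0.5]]
codes = [binarize(v, planes) for v in corpus]
best = search([1.0, 2.0, 3.0], codes, planes, k=1)
```

Here the query is identical to the first stored vector, so its code matches exactly, while the negated vector lands on the opposite side of every hyperplane.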
10:22:05 From Rudy Kirzhner to Everyone:
Since the OpenSearch plugin itself is going to be open source, certain things can be adjusted by the users
10:22:06 From Brian Grabau to Everyone:
Yup
10:23:12 From Rudy Kirzhner to Everyone:
Hi Brian, any instance that runs OpenSearch is fine. The APU is used as a service, and not as hardware reflected in your AMI
10:25:45 From Andreas (Liberty Global) to Everyone:
@Tools:
Specifically, are there any plans to fork elasticsearch-py from Elastic or make it compatible with OpenSearch?
As you mentioned previously, a license check has been introduced by Elastic as of version 7.14.0
https://elasticsearch-py.readthedocs.io/en/v7.14.0/transports.html#product-check-on-first-request
That change didn't receive a warm welcome from the community though: https://github.com/elastic/elasticsearch-py/issues/1667
I had to ask our vendor to use version 7.13.4 as stated in the compatibility matrix https://opensearch.org/docs/clients/index/ (`pip install elasticsearch==7.13.4`)
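Editor's sketch of a guard a deployment script might use to tell whether a given elasticsearch-py version falls at or after 7.14.0, the release the message above names as introducing the check. The helper names (`parse_version`, `has_product_check`) and the constant are hypothetical, and the version string is passed in by hand; nothing here inspects an installed package.

```python
# Release named in the chat as the first to include the check.
PRODUCT_CHECK_INTRODUCED = (7, 14, 0)


def parse_version(text):
    """Turn a dotted version such as '7.13.4' into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_product_check(client_version):
    """Return True if the given client version is 7.14.0 or later."""
    return parse_version(client_version) >= PRODUCT_CHECK_INTRODUCED


pinned_ok = not has_product_check("7.13.4")
```

Comparing integer tuples rather than raw strings keeps a version like 7.9.1 ordered before 7.14.0.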
10:26:32 From Erin Verbeck-Lane to Everyone:
They’re already forked, Andreas!
10:26:59 From Eli Fisher to Everyone:
@Rudy, what are the best ways for someone to try out the GSI features for OpenSearch?
10:27:29 From Erin Verbeck-Lane to Everyone:
https://github.com/opensearch-project/opensearch-py
10:27:39 From Andreas (Liberty Global) to Everyone:
Thanks a lot !
10:28:17 From Rudy Kirzhner to Everyone:
rkirzhner@gsitechnology.com
10:28:56 From Eli Fisher to Everyone:
https://pypi.org/project/opensearch-py/
10:29:01 From Ryan Paras to Everyone:
for QA - any other client library updates to note?
10:29:12 From Rudy Kirzhner to Everyone:
@Shenoy Also, fp32 is available
10:29:14 From Hasan Asfoor to Everyone:
There is a lot of development in the deep4j framework which enables the use of transformers such as BERT from Java, any plans to support semantic search capabilities based on such models?
10:29:15 From Erin Verbeck-Lane to Everyone:
Oh snap it’s on pypi now
10:29:39 From Brian Grabau to Everyone:
Is AWS going to release an ARM EC2 instance equivalent to i3.8xlarge and d3.8xlarge?
10:30:05 From Shenoy Pratik to Everyone:
@Rudy thanks for the answer. This looks quite promising.
10:30:12 From Eli Fisher to Everyone:
The go client also has a release cut for it https://github.com/opensearch-project/opensearch-go/releases/tag/1.0.0
10:30:48 From Seth Muthukaruppan to Everyone:
What is the process to report security vulnerabilities against the project? A recent vulnerability scan flagged a few packages, and we would like to open issues to get them looked into
10:30:57 From Rudy Kirzhner to Everyone:
@Hasan - you could use the vector search for semantic search
10:34:01 From Hasan Asfoor to Everyone:
The only issue with vector search is that the text-to-vec process on the input has to be done by the client, which is quite an overhead.
10:34:43 From Hasan Asfoor to Everyone:
what is GSI?
10:35:43 From Eli Fisher to Everyone:
https://www.gsitechnology.com/
10:36:21 From Rudy Kirzhner to Everyone:
https://www.gsitechnology.com/OpenSearch
10:36:22 From Hasan Asfoor to Everyone:
Thanks
10:36:29 From Rudy Kirzhner to Everyone:
@Hasan - we offer vector search acceleration
10:36:30 From Brian Grabau to Everyone:
Where? We can present
10:36:38 From Paul Borgermans to Everyone:
Thank you!
10:36:40 From Hugo Albarracín to Everyone:
Thank you from Colombia
10:36:52 From Brian Grabau to Everyone:
Link?
10:36:59 From Ryan Paras to Everyone:
many thanks everyone!
10:37:04 From Andreas (Liberty Global) to Everyone:
Thank you!
10:37:09 From Rudy Kirzhner to Everyone:
Thanks all.
10:37:28 From Anton Rubin to Everyone:
Thanks everyone!
10:37:33 From Kevin Garcia to Everyone:
https://discuss.opendistrocommunity.dev/c/community-meetings/54
10:37:45 From DETERT, JON to Everyone:
What is bucket level alerting?
10:37:50 From Salten to Everyone:
thank you
10:38:52 From DETERT, JON to Everyone:
So, is bucket-level alerting similar to elastalert’s query_key concept?
10:39:21 From DETERT, JON to Everyone:
Got it, thx!
10:39:39 From Seth Muthukaruppan to Everyone:
Any plans to support the enrich processor?
10:40:48 From Brian Grabau to Everyone:
o
10:40:51 From Seth Muthukaruppan to Everyone:
https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest-enriching-data.html
10:40:57 From DETERT, JON to Everyone:
I see that there will be a new CLI to assist in upgrading Open Distro to OpenSearch. Will it also be possible to restore a snapshot from an Elasticsearch v7.10.2 cluster into an OpenSearch cluster?
10:40:57 From Brian Grabau to Everyone:
enrich processor = I have use cases for that
10:41:02 From Sven R (@hackacad) to Kyle Davis(Direct Message):
We just finished the official FreeBSD port of OpenSearch and OpenSearch Dashboards (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257558 / https://www.freshports.org/textproc/opensearch/).
Any interest in setting up a small WebEx/whatever session to talk about upcoming FreeBSD support?
10:41:18 From Eli Fisher to Everyone:
@Brian do share! :)
10:42:40 From Sven R (@hackacad) to Everyone:
We just finished the official FreeBSD port of OpenSearch and OpenSearch Dashboards (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257558 / https://www.freshports.org/textproc/opensearch/).
Any interest in setting up (a) additional session(s) to talk about upcoming FreeBSD support?
10:42:58 From Romain Tartière (smortex) to Everyone:
It's WIP
10:43:35 From Brian Grabau to Everyone:
Depends on how it works, but we use memcache with sources like MISP, HVT lists, etc. to enrich based on data types present in the log. Right now we drop data into Elastic and then move it over to memcache; if we can just use data in an index, we can skip a step
10:43:56 From Brian Grabau to Everyone:
yes
10:46:12 From Seth Muthukaruppan to Everyone:
Thanks!
10:46:15 From Eli Fisher to Everyone:
Thanks everyone!
10:46:15 From Romain Tartière (smortex) to Everyone:
Thanks!
10:46:20 From sezuan to Everyone:
Thanks!
10:46:21 From Alessandro Bignami to Everyone:
thanks
10:46:22 From Hugo Albarracín to Everyone:
Thanks