How is text search implemented?

April 26, 2017 (tags: implemented, search, text)


1 Answer


The system tries to match the query string to a unique GeneHub gene index. Once a best match is found, the GeneHub gene index is used to retrieve the pre-computed GEPIS result.

Gene attributes and synonyms are stored in two tables, GENE and GENE_SYNONYMS, respectively. Cross-references between GeneHub gene indexes and database records are saved in the DBXREF table. The DBXREF and GENE_SYNONYMS tables are consulted in turn to find an exact match to the given query string. If the first round finds no exact match, a begin-search is performed automatically.

MySQL text search has several limitations:

• It doesn't support function-based indexes.
• Hyphenated words are treated as two words.
• MySQL comes with a default stop word list, and numbers in the query are ignored by default.

To overcome these limitations and make text search case-insensitive and consistent (e.g. IL-8, il 8 and IL8 should all return the same result), we added additional columns, SEARCH_TEXT and XREF_ID_SEARCH in the GENE_SYNON
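
The lookup order described above might be sketched roughly as follows. This is only an illustration under several assumptions, not the actual GeneHub code: it uses an in-memory SQLite database instead of MySQL, the normalization rule (lowercase, strip everything except letters and digits) is a guess at what the extra search columns hold, "begin-search" is read as a prefix match, and "best match" is taken to mean the shortest prefix hit. The column layout, the sample rows and the helper names `normalize`, `build_demo_db` and `find_gene_index` are all hypothetical.

```python
import re
import sqlite3


def normalize(term: str) -> str:
    """Assumed normalization: lowercase, keep only ASCII letters and digits."""
    return re.sub(r"[^0-9a-z]", "", term.lower())


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database; table columns and rows are made up for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DBXREF (GENE_INDEX INTEGER, XREF_ID TEXT, XREF_ID_SEARCH TEXT);
        CREATE TABLE GENE_SYNONYMS (GENE_INDEX INTEGER, SYNONYM TEXT, SEARCH_TEXT TEXT);
        """
    )
    xrefs = [(101, "XREF-0001"), (202, "XREF-0002")]
    synonyms = [(101, "IL-8"), (202, "TNF-alpha")]
    # The search columns store the normalized form, so a plain index on them
    # can serve the lookup without needing a function-based index.
    conn.executemany(
        "INSERT INTO DBXREF VALUES (?, ?, ?)",
        [(g, x, normalize(x)) for g, x in xrefs],
    )
    conn.executemany(
        "INSERT INTO GENE_SYNONYMS VALUES (?, ?, ?)",
        [(g, s, normalize(s)) for g, s in synonyms],
    )
    return conn


# Tables are consulted in this order, as in the description above.
SEARCH_TARGETS = (("DBXREF", "XREF_ID_SEARCH"), ("GENE_SYNONYMS", "SEARCH_TEXT"))


def find_gene_index(conn: sqlite3.Connection, query: str):
    """Return a gene index for the query, or None if nothing matches."""
    key = normalize(query)
    if not key:
        return None
    # Round 1: exact match on the normalized columns.
    for table, column in SEARCH_TARGETS:
        row = conn.execute(
            f"SELECT GENE_INDEX FROM {table} WHERE {column} = ? LIMIT 1", (key,)
        ).fetchone()
        if row:
            return row[0]
    # Round 2: begin-search, read here as a prefix match (an assumption).
    # The key holds only letters and digits, so no LIKE wildcards need escaping.
    for table, column in SEARCH_TARGETS:
        row = conn.execute(
            f"SELECT GENE_INDEX FROM {table} WHERE {column} LIKE ? "
            f"ORDER BY length({column}) LIMIT 1",
            (key + "%",),
        ).fetchone()
        if row:
            return row[0]
    return None
```

With the toy rows above, `find_gene_index(conn, "IL-8")`, `find_gene_index(conn, "il 8")` and `find_gene_index(conn, "IL8")` all normalize to the same key and so resolve to the same gene index, which is the consistency the extra columns are meant to provide.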