MySQL Forums

How to efficiently update word frequencies?
Posted by: Andrzej Borucki
Date: March 11, 2023 06:01AM

I am trying to build a large index of English word frequencies based on English Wikipedia.
Wikipedia is indexed in about 227 thousand blocks; each block (except possibly the last) has 100 pages.
I'm using spaCy to find the base form of each word and its part of speech.
It is not the word alone but the pair (word, pos) that must be unique (case sensitive). For example, the index must contain both ('name', 'VERB') and ('name', 'NOUN'), each with its own frequency.
I have a table words:
CREATE TABLE `words` (
`word` varchar(45) NOT NULL,
`pos` varchar(6) NOT NULL,
`count` bigint unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`word`,`pos`)
)
Each block yields several thousand words with their counts.
How can I do the following:
- if the pair (word, pos) does not exist in the table: add the pair, with the count supplied in the SQL statement
- if it exists: set count := count in the table + count supplied in the SQL statement
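The two cases in the list above are what MySQL handles with INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch, assuming MySQL 8.0.19 or later (needed for the row alias, here called `new`, a name chosen only for illustration) and made-up sample counts:

```sql
-- One multi-row statement per block; the sample values are invented.
INSERT INTO `words` (`word`, `pos`, `count`)
VALUES
  ('name', 'VERB', 12),
  ('name', 'NOUN', 30)
AS new
-- Runs only for rows whose (word, pos) primary key already exists.
ON DUPLICATE KEY UPDATE
  `count` = `words`.`count` + new.`count`;
```

On older servers the same update is usually written with VALUES(`count`) instead of the alias. Note also that the primary key is compared using the column collation, and the MySQL 8.0 default (utf8mb4_0900_ai_ci) is case insensitive, so a case-sensitive or binary collation such as utf8mb4_bin on `word` would probably be needed for 'Name' and 'name' to count as different keys.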
Is it better to update the main table directly from SQL, or to first fill a small table block_words with the same structure and then run an SQL query that adds the words from the small table to the main table?
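For comparison, the small-table variant could look roughly like this. It is only a sketch, assuming block_words is created as a copy of the structure of words and that each block is loaded into it by whatever means is convenient (that step is left out here):

```sql
-- Staging table with the same structure as the main table.
CREATE TABLE IF NOT EXISTS `block_words` LIKE `words`;

-- ... load the (word, pos, count) rows of the current block into block_words ...

-- Merge the whole block into the main table in a single statement.
INSERT INTO `words` (`word`, `pos`, `count`)
SELECT `word`, `pos`, `count`
FROM `block_words`
ON DUPLICATE KEY UPDATE
  `words`.`count` = `words`.`count` + `block_words`.`count`;

-- Empty the staging table before the next block.
TRUNCATE TABLE `block_words`;
```

Which of the two is faster likely depends on the server settings and on how the rows reach MySQL, so it is worth timing both on a few blocks. In either case, summing duplicates inside a block on the client side first means each (word, pos) pair is sent only once per block.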
