MySQL Forums
Forum List  »  Newbie

Re: Using a variable to create a table name
Posted by: Thomas Stieve
Date: October 27, 2017 12:24PM

Thank you very much for your time. I really appreciate it. This is how a record looks:

En_Title Or_Title Total Country Page_id
ZZ Top ZZ_Top 1 Germany 184951
Zyklon B Zyklon_B 2 Germany 57684
Zygosity Hemizygotie 1 Germany 77954

where En_title is the title in English, Or_Title is the title is in the language of the database, total is the number of unique IP edits that edited the page, country is the country of IP address and Page_id is the the ID of the page in that language database.

This is the structure for all of the language tables except for the English one which only has a title in English. I was planning on using the English title as the Primary Key for all the tables. At this stage, I am only working with these data points, not the full texts. That will be for another project working off the results from the this.

Getting to this point has taken a lot of processing. I only have a pseudo-code for what I need to do next. Some of these table are very large (Russian > 400K records, English >1.7 million records), so I felt overwhelmed. I guess I'm inefficiently leaning towards creating different tables due to the shear size of everything.

I think, correct me if you see something better, I can join now from the 269 different language tables into one "world" table using English as a primary key, keeping count of how many countries have edited their equivalent of the English page as I go along. That doesn't seem too difficult for me.

Question: I have to figure out to keep track of which countries as I do this might be tricky for me. It seems like one table allows only some 30-odd columns. There are titles that easily have more than 30 countries editing it.

Wikipedia has a separate table with page ID's of each article and the title of that page in all the different languages. It is set up as two columns, one page ID in that language db and the corresponding title in the other languages. I tried to start using it, but it was too big an undertaking for every single article across 271 languages (two language dbs have no geolocatable IP edits in 2016). So, I used this table for find the English title for titles from 269 dbs to get to this point.

For the localized titles, titles that are edited in only one country, I am only looking at 6 different languages. I can understand these languages and will do content analysis comparing the content of the localized articles with the globalized articles, the list of which I am creating now. I am now thinking that I can create a table of 6 countries where these language are spoken, for example Hindi in India, and populate it with the titles from all the languages that were edited in that country.

Question: I am tripping a little bit on this point. I need to find titles that were edited in only one country regardless of language. As you've pointed out, many countries have many languages, so it can't be a one-to-one relationship. I am thinking that I can find all the titles that are for example in Hindi located in India and then go backwards and using the interlanguage link table in Wikipedia find the other titles in other languages and geolocate them if they have IP edits.

Sorry if this seems very amateurish to you. I'm learning as I go along, and no one on my committee really works with MySQL. Also, much of the scholarship working on Wikipedia does not geolocate IP's from all the languages. They usually only study the English Wikipedia or simply interlanguage links. I'm learning as I go along.

Options: ReplyQuote




Sorry, you can't reply to this topic. It has been closed.

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.