Information needed:
City and intersection
City and address
And in some instances, named locations (landmarks).
Problems:
There is a lot of slang when naming locations
Some words are abbreviated
Some tweets don't have a location
Some tweets only say that they will be somewhere at a certain time
Some tweets have two locations
I want to add a certainty percentage showing how accurate we think the parsed location is. If it falls below a certain threshold, we tell the user to view the tweet for a more accurate location.
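Here's a rough sketch of how the certainty idea might work. It's in Python rather than PHP just to keep it short. The weights, the 60% threshold, and the function names are all made up for illustration; the real numbers would need tuning against actual tweets.

```python
# Hypothetical weights: how much each kind of evidence adds to the score.
WEIGHTS = {"city": 40, "intersection": 40, "address": 40, "landmark": 30}
THRESHOLD = 60  # assumed cut-off, in percent


def certainty(found):
    """Return a 0-100 score from the kinds of evidence found in a tweet."""
    return min(100, sum(WEIGHTS[kind] for kind in found))


def describe(location, found):
    """Show the parsed location, or fall back to pointing at the tweet."""
    score = certainty(found)
    if score < THRESHOLD:
        return f"View tweet for a more accurate location ({score}% certain)"
    return f"{location} ({score}% certain)"
```

With these made-up weights, a tweet with only a city scores 40% and falls back to "view tweet", while a city plus an intersection scores 80% and is shown as parsed.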
I'm thinking I'll need a list of all the cities in LA County. The PHP code will then run and check whether any city on the list appears in the tweet. Probably the same with the streets? I need to look into it.
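A minimal sketch of the city check, again in Python rather than PHP. The four cities are just a sample and `find_cities` is a name I made up; matching on word boundaries is my assumption, so that a city name inside a longer word doesn't count.

```python
import re

# A tiny sample; the real list would hold every city.
CITIES = ["Santa Monica", "Culver City", "Pasadena", "Burbank"]


def find_cities(tweet, cities=CITIES):
    """Return every listed city that appears in the tweet as a whole word."""
    found = []
    for city in cities:
        pattern = r"\b" + re.escape(city) + r"\b"
        if re.search(pattern, tweet, flags=re.IGNORECASE):
            found.append(city)
    return found
```

The same loop would work for a street list, although a list that long would probably want a faster lookup than one regex per entry.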
Some rules I'll have to write
Remove unnecessary words like: Food, Duck, Taco, the, a, etc.
Words "@" "at" "in" are usually followed by the address/ location
Cross check to find LA city
Check to find "/" and then check if the words next to it are street names
Check to find "&" and then check if the words next to it are street names
Check to find "and" and then check if the words next to it are street names
Check to find Street names
Check to find locations like Convention Center or Rodeo Drive.
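The "/", "&" and "and" checks could be sketched roughly like this (Python, not the final PHP). The street set is a made-up sample, the helper names are mine, and limiting street names to three words is an assumption.

```python
import re

# Hypothetical sample of street names; the real list would be much longer.
STREETS = {"wilshire", "fairfax", "sunset", "vine", "abbot kinney", "main"}

# The separators from the rules: "/", "&", or the word "and".
SEPARATOR = re.compile(r"\s*/\s*|\s*&\s*|\s+and\s+", re.IGNORECASE)


def _words(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def street_at_end(text):
    """Longest known street name (up to 3 words) that ends the text."""
    words = _words(text)
    for n in (3, 2, 1):
        if len(words) >= n and " ".join(words[-n:]) in STREETS:
            return " ".join(words[-n:])
    return None


def street_at_start(text):
    """Longest known street name (up to 3 words) that starts the text."""
    words = _words(text)
    for n in (3, 2, 1):
        if len(words) >= n and " ".join(words[:n]) in STREETS:
            return " ".join(words[:n])
    return None


def find_intersections(tweet):
    """Return (street, street) pairs found on either side of a separator."""
    pairs = []
    for match in SEPARATOR.finditer(tweet):
        left = street_at_end(tweet[:match.start()])
        right = street_at_start(tweet[match.end():])
        if left and right:
            pairs.append((left, right))
    return pairs
```

Requiring a known street on both sides is what keeps something like "rice and beans" from being read as an intersection.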
Time: Check if numbers are times or addresses
Check if it's a 1 or 2 digit number followed by hr, pm, am, o'clock, noon, afternoon, etc.
Check if the numbers have a "-" or "to"
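A rough sketch of the time-versus-address check, in Python rather than PHP. The patterns and the name `classify_numbers` are my own guesses at the rules, and only the hr/am/pm/o'clock suffixes are covered here. Anything that doesn't look like a time is assumed to be part of an address.

```python
import re

NUM = r"\d{1,2}(?::\d{2})?"                     # 1 or 2 digits, optional :mm
SUFFIX = r"(?:\s*(?:hrs?|am|pm|o'clock)\b)"     # time words after the number

# A range such as "11-2pm" or "11 to 2".
TIME_RANGE = re.compile(rf"{NUM}{SUFFIX}?\s*(?:-|to)\s*{NUM}(?!\d){SUFFIX}?",
                        re.IGNORECASE)
# A single time such as "2pm" or "7 o'clock".
TIME_SINGLE = re.compile(rf"{NUM}{SUFFIX}", re.IGNORECASE)
NUMBER = re.compile(r"\d+")


def classify_numbers(tweet):
    """Label each number (or time range) in the tweet as 'time' or 'address'."""
    labels = []
    pos = 0
    while True:
        found = NUMBER.search(tweet, pos)
        if not found:
            break
        rest = tweet[found.start():]
        as_time = TIME_RANGE.match(rest) or TIME_SINGLE.match(rest)
        if as_time:
            labels.append((as_time.group(0), "time"))
            pos = found.start() + as_time.end()
        else:
            labels.append((found.group(0), "address"))
            pos = found.end()
    return labels
```

A short street number range like "12-14 Main" would still be mistaken for a time, which is one more reason to lower the certainty when the only evidence is a bare number.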
List of all the cities in LA County (88). I'll have to compile a list of slang names for them too.
List of all the Streets in LA
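For the slang side of these lists, one option is a lookup table that rewrites slang into the proper name before any other check runs. This is a Python sketch with a few example entries I picked myself; the real table would have to be compiled from actual tweets.

```python
import re

# Hypothetical slang table; these entries are examples only.
ALIASES = {
    "weho": "West Hollywood",
    "bev hills": "Beverly Hills",
    "lbc": "Long Beach",
}


def normalize(tweet, aliases=ALIASES):
    """Replace each slang name with its proper city name (whole words only)."""
    result = tweet
    for slang, proper in aliases.items():
        pattern = r"\b" + re.escape(slang) + r"\b"
        result = re.sub(pattern, proper, result, flags=re.IGNORECASE)
    return result
```

Running this first means the city and street checks only ever have to know the proper names.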
Thursday, May 20, 2010