Location Inference

The Location Inference classifier infers the city, region, and country of origin of a social media post. It can be used in conjunction with other classifiers and search terms to find location-specific content. It is a specialized multi-class text classifier trained on social media post metadata.

For English Twitter posts, the Location Inference model infers the city from a set of:

  • 44 cities in the United States
  • 30 non-US cities.

The region label covers 37 US states. Together, these inferences can locate posts from 26 countries. The full list of these locations is provided later on this page.

In addition to the 26 countries above, several Country Location Inference models extend the prediction coverage for Twitter. For English Twitter posts, the country model can locate posts from the following four countries:

  • Thailand
  • Turkey
  • Puerto Rico
  • Colombia

To increase country coverage further, there is also a country-based model for Spanish-language Twitter data, which can be used to find country-specific Spanish-language content. This model detects whether the country of origin is:

  • Mexico
  • Colombia
  • Argentina
  • Chile
  • Spain
  • Peru
  • Other (refers to other Spanish-speaking countries for a Spanish language post)

In addition, for Japanese Twitter posts, the Datastreamer country location classifier detects whether a post originated from Japan.


Location Inference on Threads and Instagram

For English Instagram and Threads social media posts, the Country Location Inference classifier detects whether posts are from the following countries:

  • United States
  • United Kingdom
  • Canada
  • Australia
  • Brazil
  • Colombia
  • Turkey
  • Thailand
  • France
  • Germany
  • Mexico
  • India
  • New Zealand

To increase country coverage further, there is also a country-based model for Spanish-language Instagram and Threads data, which can be used to find country-specific Spanish-language content. This model detects whether the country of origin is:

  • Mexico
  • Colombia
  • Argentina
  • Chile
  • Spain
  • Peru
  • Other (refers to other Spanish-speaking countries for a Spanish language post)

In addition, for Japanese Instagram and Threads posts, the Datastreamer country location classifier detects whether a post originated from Japan. The volume of Japanese country location inferences for Threads is low.

Statistics

| Type | Speed | Partner Type |
| --- | --- | --- |
| Stream Integrated Classifier + Post Processing | Instant | Datastreamer Internal |

Example Use Cases

  • In conjunction with aggregations and sentiment, high-level assessments of sentiment towards a brand in a specific city could be delivered to a product's dashboard.
  • Spanish-language Location Inference can give a country-level view of Spanish content rather than relying on keywords or language.
  • Japanese-language Location Inference can give a country-level view of Japanese content rather than relying on keywords or language.
  • Location Inference can give a city-level view of content, which keywords or language alone cannot provide.
  • Location Inference can be used inversely to exclude certain cities or countries from the results for a specific area.
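The inverse use case in the last bullet can be sketched as a small filter. This is a minimal sketch, not the Datastreamer API: the helper name `exclude_locations`, the `id` field, and the assumption that each document is a plain dict carrying the `location_inference` and `location_inference_country` fields at its top level are all illustrative. Only the field names themselves come from the sample output in the Output section.

```python
from typing import Iterable, List, Set


def _label(doc: dict, field: str) -> str:
    """Read the label of one inference field, or "" if the field is absent."""
    return (doc.get(field) or {}).get("label", "")


def exclude_locations(
    docs: Iterable[dict],
    cities: Set[str] = frozenset(),
    countries: Set[str] = frozenset(),
) -> List[dict]:
    """Keep only documents whose inferred city and country are not excluded."""
    kept = []
    for doc in docs:
        if _label(doc, "location_inference") in cities:
            continue
        if _label(doc, "location_inference_country") in countries:
            continue
        kept.append(doc)
    return kept


# Hypothetical documents; the "id" field exists only for this illustration.
docs = [
    {"id": 1,
     "location_inference": {"label": "Detroit", "confidence": 0.57},
     "location_inference_country": {"label": "US", "confidence": 0.87}},
    {"id": 2,
     "location_inference": {"label": "Paris", "confidence": 0.71},
     "location_inference_country": {"label": "FR", "confidence": 0.90}},
    {"id": 3,
     "location_inference": {"label": "Toronto", "confidence": 0.66},
     "location_inference_country": {"label": "CA", "confidence": 0.81}},
]

kept = exclude_locations(docs, cities={"Detroit"}, countries={"FR"})
print([d["id"] for d in kept])  # [3]
```

Documents without any location fields pass through unchanged, which may or may not be the desired behavior for a given pipeline.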

Compatible Data Sources

As a stream-integrated classifier, it is run on ingestion for specific sources.

| Applicable Data Sources | Compatible? |
| --- | --- |
| data365_twitter_keywords | Yes, English, Spanish, and Japanese only |
| data365_twitter_profiles | Yes, English, Spanish, and Japanese only |
| wsl_instagram | Yes, English and Spanish only |
| wsl_threads | Yes, English only |
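When routing data, it can help to encode the compatibility table above as a lookup. The following is a rough sketch under stated assumptions: the source names are copied from the table, while the two-letter language codes (`en`, `es`, `ja`) and the function name `is_stream_classified` are illustrative choices, not part of the product.

```python
# Source names are taken from the compatibility table; the language codes
# are an assumed encoding of "English", "Spanish", and "Japanese".
SUPPORTED_LANGUAGES = {
    "data365_twitter_keywords": {"en", "es", "ja"},
    "data365_twitter_profiles": {"en", "es", "ja"},
    "wsl_instagram": {"en", "es"},
    "wsl_threads": {"en"},
}


def is_stream_classified(source: str, language: str) -> bool:
    """Return True if the table lists this source/language pair as compatible."""
    return language.lower() in SUPPORTED_LANGUAGES.get(source, set())


print(is_stream_classified("wsl_threads", "en"))    # True
print(is_stream_classified("wsl_instagram", "ja"))  # False
```

Sources that are not in the table return False here; they could still be handled through the post-processing operation described later on this page.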

📘

Recipe Available

View the below recipe to see it in action, and easily view how to integrate it into your own data pipeline.

Output

The Location Inference classifier outputs three labels for a given text (city, region, and country of origin), each with an associated confidence score. If the confidence is under 0.5, the prediction is not one of the trained labels, or the output is unknown, the "Other" tag is applied.
The city label is one of the city names. The region and country labels are two-letter codes, such as MI and US (ISO 3166-1 for the country).

"location_inference": {
    "label": "Detroit",
    "confidence": 0.5681
},
"location_inference_region": {
    "label": "MI",
    "confidence": 0.8361
},
"location_inference_country": {
    "label": "US",
    "confidence": 0.8681
},

Note that, unlike the city-based models, the country-based models return only the predicted country and its confidence score (no city or region).
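A consumer of this output typically wants one usable value per level. The sketch below assumes the three fields sit at the top level of a document dict, which the source does not confirm; the function name `extract_location` and the `min_confidence` parameter are hypothetical. It treats a missing field, an "Other" label, or a confidence below the threshold as "no usable location" and returns `None` for that level.

```python
# Field names are taken from the sample output; everything else is illustrative.
FIELDS = {
    "city": "location_inference",
    "region": "location_inference_region",
    "country": "location_inference_country",
}


def extract_location(doc: dict, min_confidence: float = 0.5) -> dict:
    """Return one label per level, or None where the inference is unusable."""
    result = {}
    for level, field in FIELDS.items():
        entry = doc.get(field) or {}
        # Strip whitespace so a padded "Other " label is still recognized.
        label = (entry.get("label") or "").strip()
        confidence = entry.get("confidence", 0.0)
        usable = bool(label) and label != "Other" and confidence >= min_confidence
        result[level] = label if usable else None
    return result


sample = {
    "location_inference": {"label": "Detroit", "confidence": 0.5681},
    "location_inference_region": {"label": "MI", "confidence": 0.8361},
    "location_inference_country": {"label": "US", "confidence": 0.8681},
}

print(extract_location(sample))
# {'city': 'Detroit', 'region': 'MI', 'country': 'US'}
```

Raising `min_confidence` above 0.5 gives a stricter filter; with 0.6, for example, the city in this sample is dropped while the region and country are kept. Output from a country-based model, which carries no city or region, yields `None` for those two levels.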

Post-Processing Usage

Compatible Data Sources

As a post-processing operation, the classifier can be run on any data source. It infers locations for 61 cities on integrated data sources and for 74 cities in post-processing operations. The following cities, regions, and countries are available:

  • Amsterdam, NL
  • Anchorage, AK, US
  • Atlanta, GA, US
  • Austin, TX, US
  • Baltimore, MD, US
  • Barcelona, ES
  • Berlin, DE
  • Boston, MA, US
  • Brussels, BE
  • Budapest, HU
  • Cairo, EG
  • Cape Town, ZA
  • Charleston, SC, US
  • Charlotte, NC, US
  • Cheyenne, WY, US
  • Chicago, IL, US
  • Columbus, OH, US
  • Copenhagen, DK
  • Dallas, TX, US
  • Delhi, IN
  • Denver, CO, US
  • Des Moines, IA, US
  • Detroit, MI, US
  • Doha, QA
  • Dubai, AE
  • Dublin, IE
  • El Paso, TX, US
  • Fargo, ND, US
  • Fort Worth, TX, US
  • Houston, TX, US
  • Huntsville, AL, US
  • Indianapolis, IN, US
  • Jacksonville, FL, US
  • Johannesburg, ZA
  • Kansas City, MO, US
  • Las Vegas, NV, US
  • Lima, PE
  • London, UK
  • Los Angeles, CA, US
  • Louisville, KY, US
  • Madrid, ES
  • Manila, PH
  • Melbourne, AU
  • Memphis, TN, US
  • Mexico City, MX
  • Milwaukee, WI, US
  • Minneapolis, MN, US
  • Montreal, QC, CA
  • Mumbai, IN
  • Naples, IT
  • New Orleans, LA, US
  • New York, NY, US
  • Newark, NJ, US
  • Oklahoma City, OK, US
  • Paris, FR
  • Philadelphia, PA, US
  • Phoenix, AZ, US
  • Portland, OR, US
  • Prague, CZ
  • Riyadh, SA
  • Sacramento, CA, US
  • Salt Lake City, UT, US
  • San Francisco, CA, US
  • San Diego, CA, US
  • Santa Fe, NM, US
  • Seattle, WA, US
  • Singapore, SG
  • Sydney, AU
  • Tokyo, JP
  • Toronto, CA
  • Virginia Beach, VA, US
  • Washington DC, DC, US
  • Wichita, KS, US
  • Zurich, CH

The following countries are available for English Twitter posts, with both real-time and post-processing options:

  • Puerto Rico: PR
  • Thailand: TH
  • Turkey: TR
  • Colombia: CO

The following countries are available for Spanish Twitter posts:

  • Mexico: MX
  • Colombia: CO

The following country is available for Japanese Twitter, Instagram and Threads posts:

  • Japan: JP

The following countries are available for English Threads and Instagram Posts:

  • United States: US
  • United Kingdom: UK
  • Canada: CA
  • Australia: AU
  • Brazil: BR
  • Colombia: CO
  • Turkey: TR
  • Thailand: TH
  • France: FR
  • Germany: DE
  • Mexico: MX
  • India: IN
  • New Zealand: NZ

The following countries are available for Spanish Instagram and Threads posts:

  • Mexico: MX
  • Colombia: CO
  • Spain: ES
  • Chile: CL
  • Argentina: AR
  • Peru: PE