and you want your lookup to be quick (so you don't want to do a request to another service).
IPv6 is not supported.
Quick installation:
- Download my geoip library with hostip.info zip archive.
- Unpack it into your appengine app directory
- Use the library like this:
from google.appengine.ext import webapp
import geoip

class YourHandler(webapp.RequestHandler):
    def get(self):
        (country_code, country_name) = geoip.query(self.request.remote_addr)
        self.response.out.write("You are from %s(%s)" %
                                (country_name, country_code))
I will probably not update the data files, because I'm a lazy bastard. But you can create new datafiles yourself. See below.
Debug interface
If you want to activate the debug interface, just add the following lines to your app.yaml:
- url: /geoip.*
  script: /geoip.py
  login: admin

And now the details of my geoip library (if you are interested)
The "geoip.py" library allows you to query the datafiles
The easiest way to do quick lookups is to have a local datafile containing a mapping from IP address to country code.
Of course this file should be as small as possible. I came up with a solution where all data from hostip.info fits into a little more than 1MB and all free data from MaxMind fits in less than 300KB.
On the first request I load one of these files into RAM, and after that I can look up the country for an IP address in less than 1ms.
The structure of the data files is the following:
- The first line maps the country ids to 2-character country codes
- The second line maps the country ids to the english country name
- The rest is binary data, a sequence of 4-byte records with the following structure:
  - 1st byte is the A level part of the IP
  - 2nd byte is the B level part of the IP
  - 3rd byte is the C level part of the IP
  - 4th byte is the country id (luckily there are fewer than 256 countries)
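To make the layout concrete, here is a minimal sketch that builds a tiny blob in this format and splits it apart again. It is not code from the library: it is written for Python 3 (the scripts in this post are Python 2), and the names `build_datafile` and `parse_datafile` are made up for illustration.

```python
import re


def build_datafile(codes, names, ranges):
    """Serialise a tiny data file in the layout described above.

    codes:  {country_id: two-letter code}
    names:  {country_id: English name}
    ranges: sorted list of ((a, b, c), country_id) range starts
    """
    line1 = "".join("%d:%s|" % (i, c) for i, c in sorted(codes.items()))
    line2 = "".join("%d:%s|" % (i, n) for i, n in sorted(names.items()))
    # One 4-byte record per range start: A, B, C, country id.
    records = b"".join(bytes([a, b, c, cid]) for (a, b, c), cid in ranges)
    return (line1.encode("ascii") + b"\n" +
            line2.encode("ascii") + b"\n" + records)


def parse_datafile(data):
    """Split the blob back into the two id maps and the raw records."""
    # Only the first two newlines are separators; the binary part may
    # itself contain byte 10, so split at most once each time.
    line1, rest = data.split(b"\n", 1)
    line2, records = rest.split(b"\n", 1)
    pattern = re.compile(r"(\d+):([^|]+)\|")
    codes = {int(i): c for i, c in pattern.findall(line1.decode("ascii"))}
    names = {int(i): n for i, n in pattern.findall(line2.decode("ascii"))}
    return codes, names, records


blob = build_datafile(
    {1: "CH", 2: "US"},
    {1: "Switzerland", 2: "United States"},
    [((0, 0, 0), 0), ((10, 0, 0), 1), ((10, 1, 0), 2)],
)
codes, names, records = parse_datafile(blob)
```

Each record only marks where a run of /24 blocks with the same country starts, which is why the file stays so small.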
A lookup is performed by doing a binary search over the binary data (one element is 4 bytes).
The lower bound of the binary search result contains the country id in its 4th byte; I map that id to the country code and the country name and return both. Quite simple and straightforward.
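The lookup idea can be sketched on its own like this. Again this is an illustration under assumptions, not the library code: it targets Python 3 `bytes`, the function name `lookup_country_id` is hypothetical, and it uses an "upper bound" style loop (find the first record greater than the key, then step back one) instead of the exact loop in geoip.py.

```python
def lookup_country_id(records, ip):
    """Return the country id for a dotted IPv4 string, or None.

    records: bytes made of sorted 4-byte entries (A, B, C, country id).
    """
    try:
        a, b, c = (int(p) for p in ip.split(".")[:3])
        key = bytes([a, b, c])
    except ValueError:
        # Not a dotted IPv4 address (for example IPv6).
        return None
    lo, hi = 0, len(records) // 4
    while lo < hi:
        mid = (lo + hi) // 2
        if records[mid * 4:mid * 4 + 3] <= key:
            lo = mid + 1
        else:
            hi = mid
    if lo == 0:
        # No range starts at or before this address.
        return None
    # The record just before the first greater one covers the address.
    return records[(lo - 1) * 4 + 3]


# Sample data (made up): 0.0.0 -> id 0, 10.0.0 -> id 1,
# 10.1.0 -> id 2, 11.0.0 -> id 0
sample = bytes([0, 0, 0, 0,
                10, 0, 0, 1,
                10, 1, 0, 2,
                11, 0, 0, 0])
```

With the sample data, an address such as 10.0.5.9 falls into the range that starts at 10.0.0, so the lookup returns id 1; the id would then be mapped to a code and a name via the two header lines.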
The "geoip.py" library allows you to query the datafiles
#!/usr/bin/env python
"""
Created by Andrin von Rechenberg, 2011.
This library is free software: you can redistribute it
and/or modify it under the terms of the GNU General Public License
as published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.

Example usage on Appengine:

class YourHandler(webapp.RequestHandler):
    def get(self):
        (country_code, country_name) = geoip.query(self.request.remote_addr)
        self.response.out.write("You are from %s(%s)" %
                                (country_name, country_code))

To add the stats UI just add these lines to your app.yaml:
- url: /geoip.*
  script: /geoip.py
  login: admin

Cheers,
-Andrin
"""

import os
import re

countries = {}
ip_ranges = ""
country_names = {}


def query(ip, raiseError=False):
    global countries, country_names, ip_ranges
    if not ip_ranges:
        setup()
    try:
        parts = ip.split(".")
        find = "%c%c%c" % (int(parts[0]), int(parts[1]), int(parts[2]))
    except:
        # IPv6 is not supported
        return (None, None)
    # Binary search over the 4-byte records (3 bytes of IP + 1 byte id).
    lo = 0
    hi = (len(ip_ranges)) // 4
    while lo < hi:
        mid = (lo + hi) // 2
        midval = ip_ranges[mid * 4:mid * 4 + 3]
        if midval < find:
            lo = mid + 1
        elif midval > find:
            hi = mid
        else:
            # Exact match: point hi just past mid so that "found" below
            # lands on the matching record (hi was left unchanged here
            # before, which could select the wrong record).
            hi = mid + 1
            break
    found = hi * 4 - 4
    if ip_ranges[found:found + 3] > find:
        found -= 4
    id = ord(ip_ranges[found + 3:found + 4])
    return (countries.get(id), country_names.get(id))


def setup():
    global countries, country_names, ip_ranges
    data = open(os.path.join(os.path.dirname(__file__),
                             "geoip.bin"), "rb").read()
    # First line: country id -> 2-character country code
    country_data = data[:data.find("\n")]
    pattern = re.compile("(\d+)\:([^\|]+)\|")
    for country in pattern.findall(country_data):
        countries[int(country[0])] = country[1]
    # Second line: country id -> country name
    data = data[data.find("\n") + 1:]
    country_names_data = data[:data.find("\n")]
    for name in pattern.findall(country_names_data):
        country_names[int(name[0])] = name[1]
    # Everything after that is the binary range data
    ip_ranges = data[data.find("\n") + 1:]


"""
***************  ALL THE CODE BELOW IS OPTIONAL  ***************
"""

from google.appengine.ext import webapp
from google.appengine.ext.webapp import util


class GeoIPStats(webapp.RequestHandler):

    def stats(self):
        global countries, country_names, ip_ranges
        if not ip_ranges:
            setup()
        count = {}
        pos = 0
        this = 0
        while pos < len(ip_ranges):
            # A range ends where the next record starts; the last range
            # runs to the end of the address space. (The start of the
            # next record is read from pos + 4; reading it from pos
            # credited each range to the following country.)
            next = 256 * 256 * 256
            if pos + 4 < len(ip_ranges):
                next = (ord(ip_ranges[pos + 4]) * 256 * 256 +
                        ord(ip_ranges[pos + 5]) * 256 +
                        ord(ip_ranges[pos + 6]))
            id = ord(ip_ranges[pos + 3])
            if id not in count:
                count[id] = 0
            count[id] += 256 * (next - this)
            this = next
            pos += 4
        result = [(countries.get(x), country_names.get(x), count[x])
                  for x in count]
        result.sort()
        return result

    def get(self):
        self.response.out.write(
            "<html><body><center><b>Appengine GeoIP by N-Dream</b><br><br>")
        ip = self.request.get("ip")
        show_stats = self.request.get("stats")
        if ip:
            (cc, name) = query(ip)
            self.response.out.write("IP is from: %s (%s)" % (cc, name))
        elif show_stats:
            self.response.out.write("<table>")
            for stat in self.stats():
                cc = stat[0]
                name = stat[1]
                count = str(stat[2])
                # Insert ' as a thousands separator
                if len(count) > 3:
                    for i in range(len(count) - 3, 0, -3):
                        count = count[:i] + "'" + count[i:]
                self.response.out.write(
                    "<tr><td><b>%s</b></td><td>%s</td><td align=right>%s</td></tr>"
                    % (cc, name, count))
            self.response.out.write("</table>")
        else:
            self.response.out.write(
                "<form>IP:<input type=text name=ip><input type=submit></form>")
            self.response.out.write("<a href=?stats=1>Statistics</a>")
        self.response.out.write("</center></body></html>")


application = webapp.WSGIApplication([
    ('.*', GeoIPStats),
], debug=True)


def main():
    util.run_wsgi_app(application)
Creating new compressed datafiles:
If you want to create new datafiles, just run one of the following python scripts.
Create a datafile from hostip.info:
#!/usr/bin/env python
"""
Created by Andrin von Rechenberg, 2011.
This library is free software: you can redistribute it
and/or modify it under the terms of the GNU General Public License
as published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.
Example usage:
python geoip_hostipinfo.py
Cheers,
-Andrin
"""
import gzip
import sys
import re
import urllib
import cStringIO
out = open("geoip.bin","wb")
country_names = {}
ip_ranges = {}
print "Downloading... (might take a while)"
zipped = urllib.urlopen("http://db.hostip.info/mirror/" +
"hostip_current.sql.gz").read()
data = gzip.GzipFile(fileobj=cStringIO.StringIO(zipped)).read()
for line in data.split("\n"):
    if line.startswith("INSERT INTO `countries`"):
        p = re.compile("\((\d+),'(.*?)','([A-Z]+)'\)")
        for x in p.findall(line):
            # First header line: id -> country code
            out.write(x[0] + ":" + x[2].replace("|", "") + "|")
            country_names[x[0]] = " ".join([c[0] + c[1:].lower()
                                            for c in
                                            x[1].replace("\\", "").split(" ")])
        out.write("\n")
        # Second header line: id -> country name
        for x in country_names:
            out.write(x + ":" + country_names[x].replace("|", "") + "|")
        out.write("\n")
    if line.startswith("INSERT INTO `ip4_"):
        a = line[line.find("ip4_") + 4:]
        a = int(a[:a.find("`")])
        print "Processing A Level IP address block " + str(a) + "."
        for block in line.split(")"):
            if block.strip() == ";":
                continue
            (b, c, country, city, time) = block.split("(")[1].split(",")
            ip_ranges[a * 256 * 256 + int(b) * 256 + int(c)] = int(country)
if not ip_ranges or not country_names:
    print "Countries or IP ranges are missing"
    sys.exit()
print "Writing file..."
# Only write a record when the country changes from one /24 block
# to the next; this is what keeps the file small.
last_country = None
for a in range(256):
    for b in range(256):
        for c in range(256):
            country = ip_ranges.get(a * 256 * 256 + b * 256 + c, 0)
            if country != last_country:
                out.write("%c%c%c%c" % (a, b, c, country))
                last_country = country
out.close()

... or create a datafile from maxmind.com:
#!/usr/bin/env python
"""
Created by Andrin von Rechenberg, 2011.
This library is free software: you can redistribute it
and/or modify it under the terms of the GNU General Public License
as published by the Free Software Foundation, either version 3 of
the License, or (at your option) any later version.
Example usage:
python geoip_maxmind.py
Cheers,
-Andrin
"""
import cStringIO
import sys
import re
import urllib
import zipfile
out = open("geoip.bin","wb")
countries = {}
country_names = {}
ip_ranges = {}
print "Downloading... (might take a while)"
zipped = urllib.urlopen("http://geolite.maxmind.com/download/geoip/database/" +
"GeoIPCountryCSV.zip").read()
print "Processing..."
zip = zipfile.ZipFile(cStringIO.StringIO(zipped))
csv_filename = None
for file in zip.filelist:
    if file.filename.endswith(".csv"):
        csv_filename = file.filename
        break
if not csv_filename:
    print "csv file not found in archive"
    sys.exit()
for line in zip.read(csv_filename).split("\n"):
    if not line:
        continue
    parts = [x.replace('"', "") for x in line.split(",")]
    if parts[4] not in countries:
        # Assign the next free id to a country code seen for the first time
        countries[parts[4]] = len(countries) + 1
        country_name = ",".join(parts[5:]).replace('"', "")
        country_names[country_name] = len(countries)
    # parts[2] and parts[3] are the numeric start and end of the range;
    # dividing by 256 gives the /24 block index.
    for i in range(int(parts[2]) // 256, int(parts[3]) // 256 + 1):
        ip_ranges[i] = countries[parts[4]]
for country in countries:
    out.write(str(countries[country]) + ":" + country.replace("|", "") + "|")
out.write("\n")
for country in country_names:
    out.write(str(country_names[country]) + ":" + country.replace("|", "") + "|")
out.write("\n")
if not ip_ranges or not countries:
    print "Countries or IP ranges are missing"
    sys.exit()
print "Writing file..."
# Only write a record when the country changes from one /24 block
# to the next.
last_country = None
for a in range(256):
    for b in range(256):
        for c in range(256):
            country = ip_ranges.get(a * 256 * 256 + b * 256 + c, 0)
            if country != last_country:
                out.write("%c%c%c%c" % (a, b, c, country))
                last_country = country
out.close()

Cheers,
-Andrin
PS: Code was colorized using pygments.org
Super useful. Thanks Andrin.