Reading log and saving data to DB using python script

0 votes
831 views

I have an Ubuntu server running NGINX that logs data for me. I want to write a Python script that reads my customized logs and, after a little rearrangement, saves the new data into my DB (PostgreSQL).

The process should run about every 5 minutes, and I'm expecting large chunks of data in several of the 5-minute windows.

My plan for achieving this is to install Python on the server, write a script, and add it to cron.

My question is: what is the simplest way to do this? Should I use any Python frameworks?
For my Python app I'm using Django, but on this server I just need to read a file, do some manipulation, and save to the DB.

posted Aug 14, 2013 by Anderson

2 Answers

+1 vote

Rarely do I put "framework" and "simplest way" in the same set.

I would do one of two things:

  • Write a simple script that reads lines from stdin and writes to the DB. Make sure it is started at init before nginx is, and pipe the output of tail -F -n 0 into that script. Don't worry about the 5-minute cron.

  • Similar to the above, but if you want to use cron, also store in the DB the offset of the last byte read in the file. When the cron job kicks off again, seek to that position + 1 and begin reading; at EOF, write the offset again.

This is irrespective of any log rotating that is going on behind the scenes, of course.
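The second option could be sketched roughly as below. This is only an illustration, not a tested component: the function name `read_new_lines` is made up, the caller is assumed to load and save the offset itself (in the DB, as suggested above, or anywhere else), and the reset on a shrinking file is a simple guess at handling truncation rather than real log-rotation support. Note that this version saves the position of the next unread byte, so it seeks to the saved value directly with no + 1.

```python
import os


def read_new_lines(log_path, last_offset):
    """Return (lines, new_offset) for data appended since last_offset.

    last_offset is the byte position saved by the previous run
    (0 on the first run). Persisting it is left to the caller.
    """
    # Assumption: if the file is shorter than the saved offset, it was
    # truncated or replaced, so start again from the beginning.
    if os.path.getsize(log_path) < last_offset:
        last_offset = 0

    with open(log_path, "rb") as f:
        f.seek(last_offset)
        data = f.read()  # for very large windows, read in chunks instead

    # Consume complete lines only; a half-written last line is left
    # in place and picked up by the next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, last_offset + end
```

Each cron run would then load the saved offset, call `read_new_lines`, process the lines, and store the returned offset only after the DB write has succeeded.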

answer Aug 14, 2013 by Deepankar Dubey
Not sure I understood the first option and what it means to run it before nginx.

The second option sounds more like what I had in mind. Aren't there any existing components like this that I can use? Since the log fills up quickly, I'm having trouble reading so much data and writing it all to the DB in a reasonable amount of time.

The table receiving the new data is somewhat complex. Its purpose is to save data about ads shown from my app. The fields are (ad_id, user_source_site, user_location, day_date, specific_hour, views, clicks), and each row is distinct by the first 5 fields, since I need to show different types of stats.

Because each new line may or may not already be in the DB, I have to run an upsert command (update or insert) on each row.

This leads to very poor performance. Do you have any ideas about how I can make this script more efficient?
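One common way to reduce the number of upserts is to add up the counts in memory first, so that each distinct (ad_id, user_source_site, user_location, day_date, specific_hour) combination is written once per run instead of once per log line. The sketch below is only an illustration under several assumptions: the table name `ad_stats` is made up, each parsed log line is assumed to yield a tuple ending in "view" or "click", the `%s` placeholders follow psycopg2's style, and the `ON CONFLICT` clause needs PostgreSQL 9.5 or later plus a unique index on the first 5 columns. On older versions the SQL would have to be replaced with a separate update-then-insert step.

```python
from collections import defaultdict

# Hypothetical table name; the columns are the ones listed above.
# Requires a unique index on the five key columns.
UPSERT_SQL = """
INSERT INTO ad_stats
    (ad_id, user_source_site, user_location, day_date, specific_hour,
     views, clicks)
VALUES (%s, %s, %s, %s, %s, %s, %s)
ON CONFLICT (ad_id, user_source_site, user_location, day_date, specific_hour)
DO UPDATE SET views = ad_stats.views + EXCLUDED.views,
              clicks = ad_stats.clicks + EXCLUDED.clicks
"""


def aggregate(events):
    """Collapse parsed log events into one row per distinct key.

    events: iterable of (ad_id, site, location, day, hour, kind),
    where kind is "view" or "click" (an assumed log layout).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [views, clicks]
    for ad_id, site, location, day, hour, kind in events:
        counts = totals[(ad_id, site, location, day, hour)]
        if kind == "click":
            counts[1] += 1
        else:
            counts[0] += 1
    return [key + tuple(counts) for key, counts in totals.items()]


def flush(cursor, rows):
    """Send all aggregated rows through a DB-API cursor in one call."""
    cursor.executemany(UPSERT_SQL, rows)
```

With this shape, a 5-minute window containing thousands of lines for the same ad, site, location, and hour turns into a single upsert for that key.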
+1 vote

Is the log coming from NGINX or (since you mention Django below) coming solely from the Django application?

If the logging is from the Django application only, you should be able to have it connect to the database and write directly to it.

answer Aug 14, 2013 by Mandeep Sehgal
The log is from NGINX.