Python: Reading all buffered bytes without blocking

+2 votes
452 views

Is it possible to say to a BufferedReader stream "give me all the bytes you have available in the buffer, or do one OS call and give me everything you get back"? The problem is that the "number of bytes" argument to read1() isn't optional, so I can't do available_bytes = fd.read1().

I need this because I want to decode the returned bytes from UTF-8, and I *might* get a character split across the boundary of any arbitrary block size I choose. (I'm happy to ignore the possibility that the *source* did a flush part-way through a character).

I don't really want to have to do incremental decoding if I can avoid it - it looks hard...
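To make the concern concrete, here is a minimal sketch (assuming Python 3; the sample string and the split point are made up for illustration) of what happens when a chunk boundary falls inside a multi-byte UTF-8 character, and how an incremental decoder from the standard codecs module copes with it:

```python
import codecs

data = "na\u00efve".encode("utf-8")   # b'na\xc3\xafve': the i-diaeresis takes two bytes
chunks = [data[:3], data[3:]]          # the boundary falls inside that character

# Decoding each chunk on its own fails at the split character.
try:
    chunks[0].decode("utf-8")
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

# An incremental decoder holds back the incomplete byte sequence
# until the remaining bytes arrive in a later chunk.
decoder = codecs.getincrementaldecoder("utf-8")()
text = "".join(decoder.decode(chunk) for chunk in chunks)
text += decoder.decode(b"", final=True)   # flush; raises if bytes are left over
```

The decoder object carries the partial character between calls, so the chunk size can be arbitrary.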

posted Mar 3, 2015 by Ankit

Just specify a large size.
Thanks. Looking at the source, it appears that a large size will allocate a buffer of that size for the data even if the amount actually read is small (thinking about it, of course it has to, doh, because the syscall needs somewhere to put the data).
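A minimal sketch of the "large size" suggestion, assuming Python 3 and a pipe created only for the demonstration. The helper name read_available and the 64 KiB limit are illustrative choices, not something from the thread. As far as I know, Python 3.7 and later also allow calling read1() with no argument, which was not the case when this was asked.

```python
import os

def read_available(stream, max_bytes=65536):
    """Return whatever a single read1() call yields, up to max_bytes."""
    # read1() makes at most one call to the underlying raw stream,
    # so it does not wait for max_bytes to accumulate.
    return stream.read1(max_bytes)

r_fd, w_fd = os.pipe()
reader = os.fdopen(r_fd, "rb")      # binary mode gives an io.BufferedReader
os.write(w_fd, b"hello")            # only 5 bytes are available
chunk = read_available(reader)      # returns those 5 bytes, not 65536

os.close(w_fd)
reader.close()
```

Note that read1() still blocks if nothing at all is available yet; it only avoids waiting for the full requested size.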

Anyway, it's a pretty microscopic risk in practice, and when I looked at them, the incremental codecs (codecs.iterdecode) really aren't that hard to use, so I can do it that way if it matters enough.

For what it's worth, in case anyone wants to know, incremental decoding looks like this:

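import codecs
import sys

# Assumes `process` is a subprocess.Popen created with stdout=subprocess.PIPE
# and `encoding` names the stream's encoding, e.g. "utf-8".
def get():
    while True:
        # read(1000) may wait for 1000 bytes or EOF; read1(1000) returns sooner.
        data = process.stdout.read(1000)
        if not data:
            break
        yield data

for data in codecs.iterdecode(get(), encoding):
    sys.stdout.write(data)
    sys.stdout.flush()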
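The same idea can be tried without a subprocess. The sketch below is an assumption-laden variant: iter_decoded is a hypothetical helper name, io.BytesIO stands in for the process pipe, and the chunk size is set to 2 only to force a split inside a multi-byte character.

```python
import codecs
import io

def iter_decoded(stream, encoding="utf-8", chunk_size=4096):
    """Yield decoded text from a binary stream, one chunk at a time."""
    def chunks():
        while True:
            data = stream.read1(chunk_size)
            if not data:            # b"" means end of stream
                break
            yield data
    # iterdecode keeps incomplete byte sequences between chunks.
    return codecs.iterdecode(chunks(), encoding)

# Stand-in for process.stdout; the accented letters take two bytes each.
source = io.BytesIO("h\u00e9llo w\u00f6rld".encode("utf-8"))
pieces = list(iter_decoded(source, chunk_size=2))
result = "".join(pieces)
```

Using read1() here instead of read() means each chunk is returned as soon as one underlying read completes, which is closer to what the original question asked for.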

Similar Questions
+2 votes

I am using Python and pyserial to talk to an embedded PIC processor in a piece of scientific equipment. I sometimes find that when I construct the bytes object to write, it adds an extra f to the first byte.

For example, if I have b'\x03\x66\x02\x01\xaa\xbb' it evaluates to b'\x03f\x02\x01\xaa\xbb', which doesn't even seem valid. Can anyone shed some light on this?
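A short sketch of what is probably going on, assuming Python 3 and assuming the six byte values are 03 66 02 01 aa bb (the escapes in the post appear to have lost their backslashes). Byte 0x66 is the ASCII code for the letter f, and the repr of a bytes object shows printable bytes as characters, so nothing is added to the data:

```python
# Build the payload from explicit integer values to avoid any escape confusion.
payload = bytes([0x03, 0x66, 0x02, 0x01, 0xAA, 0xBB])

shown = repr(payload)     # printable bytes are displayed as ASCII characters
as_hex = payload.hex()    # every byte as two hex digits, nothing abbreviated
length = len(payload)     # still six bytes
```

Printing payload.hex() before writing to the serial port is one way to confirm that the bytes on the wire are the intended ones.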

+1 vote

I'm using Python 2.7 under Windows and am trying to run a command-line program and process the program's output as it is running. A number of web searches have indicated that the following code would work.

import subprocess

p = subprocess.Popen(r"D:\Python\Python27\Scripts\pip.exe list -o",
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT,
                     bufsize=1,
                     universal_newlines=True,
                     shell=False)
for line in p.stdout:
    print line

When I use this code I can see that the Popen works: any code between the Popen call and the for loop runs straight away, but as soon as execution reaches the for loop and tries to read p.stdout, it blocks until the command-line program completes, and then all of the lines are returned at once. Does anyone know how to get the program's output without blocking?
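One commonly suggested workaround, sketched here under assumptions: in Python 2, iterating over a file object uses a hidden read-ahead buffer, so calling readline() through iter() tends to deliver lines sooner. The child process below is a made-up stand-in (a Python one-liner run with -u so that it does not buffer its own output), not the pip command from the question. If the child program buffers its output when writing to a pipe, no change on the reading side will help.

```python
import subprocess
import sys

# Hypothetical child: prints two lines; -u asks Python not to buffer stdout.
child_code = "print('one'); print('two')"
p = subprocess.Popen(
    [sys.executable, "-u", "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    universal_newlines=True,
)

lines = []
# readline() returns as soon as one full line is available; "" signals EOF.
for line in iter(p.stdout.readline, ""):
    lines.append(line.rstrip("\n"))

p.stdout.close()
p.wait()
```

The iter(callable, sentinel) form works in both Python 2.7 and Python 3, so the same loop can be dropped into the original script.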

0 votes

I have an Ubuntu server running NGINX that logs data for me. I want to write a Python script that reads my customized logs and, after a little rearrangement, saves the new data into my DB (PostgreSQL).

The process should run about every 5 minutes, and I'm expecting large chunks of data in some of those 5-minute windows.

My plan for achieving this is to install Python on the server, write a script and add it to cron.

My question is: what is the simplest way to do this? Should I use any Python frameworks?
For my Python app I'm using Django, but on this server I just need to read a file, do some manipulation and save to the DB.

...