Natural Language Processing with Python .dispersion_plot returns nothing

0 votes
634 views

I'm in the first chapter of Natural Language Processing with Python and am trying to run the example .dispersion_plot. I am using Python 2.7.4 (Anaconda) on Mac OSX 10.8.

When I load all of the necessary modules and try to create the dispersion plot, I get nothing back: no plot, no error message, not even a new >>> prompt, just a blinking cursor under the last line I typed. Here is what I've been doing:

[~]: python
Python 2.7.4 |Anaconda 1.5.1 (x86_64)| (default, May 9 2013, 12:12:00) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import numpy
>>> import matplotlib
>>> import nltk
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.dispersion_plot(["citizens", "democracy", "freedom", "duties", "America"])

...and nothing. I can't paste it but my cursor is just blinking under my last command with no prompt. So far the other example commands from the chapter (e.g. .concordance) work fine, so I'm guessing the problem is something with numpy or matplotlib. I had a heck of a time getting matplotlib installed correctly (kept getting errors saying that it wasn't installed even when I had installed it), but since switching to the Anaconda distro, which had those prepackaged, I haven't gotten any module errors.

Any advice??

posted Jun 17, 2013 by anonymous


2 Answers

+1 vote

The dispersion_plot() method uses pylab.show() to display the data in a separate window. The interactive interpreter only becomes responsive again once you close that window.

If you didn't overlook that window: do you run into the same problem with

>>> import pylab
>>> pylab.plot([1, 2, 3], [3, 1, 2])
[]
>>> pylab.show()

? If so, choose another backend. I've not tried it, but it seems straightforward.
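A minimal sketch of what switching backends could look like, assuming the TkAgg backend (which needs Tk support) is available in your install. The function name show_test_plot is made up for illustration. matplotlib.use() is called before pyplot is imported, which is the safe order on older matplotlib versions.

```python
import matplotlib


def show_test_plot(backend="TkAgg"):
    """Select a backend, draw a tiny plot, and return the active backend name.

    "TkAgg" is only an assumed default; any backend your install supports works.
    """
    # Select the backend before pyplot is imported.
    matplotlib.use(backend)
    import matplotlib.pyplot as plt

    plt.plot([1, 2, 3], [3, 1, 2])
    # On interactive backends this blocks until the plot window is closed.
    plt.show()
    return matplotlib.get_backend()
```

Calling show_test_plot("TkAgg") in a fresh interpreter should open a window. If it does, selecting the same backend before importing nltk.book may get dispersion_plot to display as well.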

answer Jun 17, 2013 by anonymous
0 votes

How long did you wait for results before interrupting the command?
How large is text4? It might just take a while to process.

answer Jun 17, 2013 by anonymous
I let it run for 5-10 minutes. It does this no matter which text I run the dispersion plot on.
Similar Questions
+4 votes

I have a list of tuples where neither the number of rows in the list nor the number of columns in each tuple is fixed, i.e.

list = [(a1, b1, …, z1), (a2, b2, …, z2), …, (am, bm, …, zm)]. It is comparable to SQL results: as the number of columns in the SQL query changes, the number of columns in the list of tuples changes as well.

I have to iterate through each element of each tuple in the list, perform some checks on the value, convert it and return the modified values as a new list of tuples.

i.e.  list_value = [('name1', 1234, 'address1'), ('name2', 5678, 'address2'), ('name3', 1011, 'addre"ss3')]

I need to check whether each value is a string containing a double quote and, if so, enclose that string in double quotes. The transformation should return

list_value = [('name1', 1234, 'address1'), ('name2', 5678, 'address2'), ('name3', 1011, '"addre"ss3"')]

The simplest approach for me would be to do this:

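def transform(row):
    mod_list = []
    index = 0
    while index < len(row):
        value = row[index]
        # basestring exists only in Python 2; use str on Python 3
        if isinstance(value, basestring) and '"' in value:
            # wrap strings that contain a double quote in double quotes
            mod_list.append('"' + value + '"')
        else:
            mod_list.append(value)
        index = index + 1
    # return a tuple so the result is again a list of tuples
    return tuple(mod_list)

mod_val = [transform(row) for row in list_value]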

Is there a way to make the code concise using list comprehension?
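One possible way to do this with comprehensions, as a sketch rather than a definitive answer: a small helper decides per value, and a nested comprehension rebuilds each row as a tuple. The names quote_if_needed and transform_rows are hypothetical, and the code assumes Python 3 (str); on Python 2 the check would use basestring instead.

```python
def quote_if_needed(value):
    """Wrap value in double quotes if it is a string containing a double quote."""
    if isinstance(value, str) and '"' in value:
        return '"' + value + '"'
    return value


def transform_rows(rows):
    """Return a new list of tuples with quote_if_needed applied to every value."""
    return [tuple(quote_if_needed(value) for value in row) for row in rows]


list_value = [
    ("name1", 1234, "address1"),
    ("name2", 5678, "address2"),
    ("name3", 1011, 'addre"ss3'),
]
mod_val = transform_rows(list_value)
print(mod_val)
```

Because the inner generator iterates over whatever each row contains, the number of columns does not need to be known in advance.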

0 votes

I am trying to use mitmproxy behind a company proxy that requires a user/password login.

The setup is: Local PC's browser -> mitmproxy (on local PC) -> company proxy -> internet.

Based on this SO thread, this is how you use mitmproxy within a Python program. This example works fine when there's no proxy.

from mitmproxy.options import Options
from mitmproxy.proxy.config import ProxyConfig
from mitmproxy.proxy.server import ProxyServer
from mitmproxy.tools.dump import DumpMaster

class Addon(object):
    def __init__(self):
        pass

    def request(self, flow):
        # examine request here 
        pass

    def response(self, flow):
        # examine response here
        pass


if __name__ == "__main__":

    options = Options(listen_host='0.0.0.0', listen_port=8080, http2=True)
    m = DumpMaster(options, with_termlog=False, with_dumper=False)
    config = ProxyConfig(options)

    m.server = ProxyServer(config)
    m.addons.add(Addon())

    try:
        print('starting mitmproxy')
        m.run()
    except KeyboardInterrupt:
        m.shutdown()

Assuming the company proxy is at IP "1.2.3.4" port 3128 and requires a login with USER and PASSWORD, how can I change this script to have mitmproxy use that proxy instead of going to the internet directly?

Additional info: I am not using mitmdump with the script parameter to run this script. The goal is to run this from Python 3.8 with a pip-installed mitmproxy.

...