1) Install Python
Download the Windows installer (MSI) package for the latest 2.7.X version of Python. I’m going to be using Python 2.7 because it is more compatible with some of the tools I need. At the time of this writing, the latest is Python 2.7.8.
http://www.python.org/downloads
Accept all the defaults for installation. This will place the Python install in C:\Python27
2) Install pip for Python
In order to make installing Python packages much easier I’m going to install pip.
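Download the file https://bootstrap.pypa.io/get-pip.py , which is linked from https://pip.pypa.io/en/latest/installing.html . This is a Python-based installer for pip, which will make life easier from here on. Run it from the Windows command prompt, in the directory where you saved it:
python get-pip.py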
3) Install and setup virtualenv
This is super simple now that we have “pip” in place. The Python package “virtualenv” is a way to isolate the Python environment for better control over versioning and is considered a Python best practice. It also has the benefits of isolating libraries and not requiring root privileges. You can learn more about virtualenv at http://pypi.python.org/pypi/virtualenv , but we won’t be going there for the installation. We’ll use “pip install” instead.
Make sure you issue this command from the Windows command prompt!
pip install virtualenv
Now that the virtualenv is installed, I’m going to create a new environment called PlayPyNLP. Go to a Windows Command prompt and change the directory to a place where you want to have your Python environment installed. I’m going to pick C:\Documents and Settings\Chris\workspace because that’s where I have my “Eclipse” workspace pointing. Again, make sure you execute this command from the Windows command prompt.
virtualenv --no-site-packages PlayPyNLP
The option --no-site-packages (note the two leading dashes) creates an isolated environment: packages installed in the global site-packages directory are not visible inside it, so anything the environment needs has to be installed into it explicitly.
Next up, we are going to use the virtualenv we just created. To do this it must be Activated. First, switch into the newly created directory.
cd PlayPyNLP
Now Activate the environment.
Scripts\activate
You should see that the command prompt has changed such that the name of the environment is placed in parentheses ahead of the path. This means you are in your virtual environment. (If you want to go back to the global environment you would type Scripts\deactivate).
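If you would rather confirm the activation from Python than rely on the prompt, the following is a minimal sketch. The function name in_virtualenv is my own and not part of virtualenv; it assumes the usual convention that classic virtualenv sets sys.real_prefix, while newer tools make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Classic virtualenv (the kind used in this post) sets sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the standard-library venv module instead
    # make sys.base_prefix differ from sys.prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("Interpreter prefix: %s" % sys.prefix)
    print("Inside a virtual environment: %s" % in_virtualenv())
```

Run with the environment activated, the prefix should point inside the PlayPyNLP directory.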
To check that everything is working correctly so far, let’s install numpy into the virtual environment PlayPyNLP.
You should already be in a Windows Command prompt with the virtual environment activated. Note that installing packages must be done in a Windows Command prompt, not a Python shell.
Type pip install numpy . If everything went well it should show a lot of stuff on the screen and end up saying “Successfully installed numpy”.
4) Install Python PyMongo Driver
Now that all the configuration work is done for the python install and the virtual environment, it’s time to hook it all up to MongoDB.
I’m assuming that MongoDB has already been installed and configured, and that it is running, as described in the previous post.
Now let’s install the PyMongo driver. Make sure that PlayPyNLP is activated and is the current working directory. Then type
pip install pymongo
To check that it worked and that you will be able to import pymongo in a script, type the command below. The “-c” option tells Python to execute the quoted string as a command instead of starting the interactive shell. Use plain straight double quotes around the command.
python -c "import pymongo"
If nothing happens this means it works. If an ImportError occurs it means it didn’t work.
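The same check can be done for several packages at once. This is only a sketch: the helper name can_import is hypothetical, and the package list simply mirrors the two packages installed so far in this post.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The packages installed into PlayPyNLP earlier in this post.
    for name in ("numpy", "pymongo"):
        status = "OK" if can_import(name) else "MISSING"
        print("%-8s %s" % (name, status))
```

A package reported as MISSING usually means that pip install was run outside the activated environment.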
5) Connect MongoDB and Python
Now let’s write a script that connects up MongoDB and Python. I’m going to create a new directory inside PlayPyNLP called src in which I’m going to store my user scripts.
mkdir src
Now I’m going to create a script inside that directory called test_mongodb_connection.py. The script that I’m going to use originally comes from the O’Reilly book “MongoDB & Python”. The source code for the book is located on the O’Reilly site, but I’ve included it inline for convenience. Note that this assumes MongoDB is running on port 27017, which is the default port for MongoDB.
http://examples.oreilly.com/0636920021513/Chapter02/ch02-02-connecting-to-mongodb.py
""" An example of how to connect to MongoDB """
import sys

from pymongo import Connection
from pymongo.errors import ConnectionFailure


def main():
    """ Connect to MongoDB """
    try:
        c = Connection(host="localhost", port=27017)
        print "Connected successfully"
    except ConnectionFailure, e:
        sys.stderr.write("Could not connect to MongoDB: %s" % e)
        sys.exit(1)


if __name__ == "__main__":
    main()
Once that script is created, execute it by typing the command (in the Windows Command prompt)
python src\test_mongodb_connection.py
If it reports “Connected successfully” everything is working.
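One caveat: the Connection class used in the book’s script was removed in PyMongo 3.0 in favour of MongoClient, so with a newer driver the import will fail. Below is a rough equivalent, offered as a sketch rather than the book’s code. It assumes PyMongo 3.0 or newer; the function name check_mongodb and the 2-second timeout are my own choices. MongoClient connects lazily, so the sketch sends a “ping” command to force a real connection attempt.

```python
import sys

try:
    from pymongo import MongoClient
    from pymongo.errors import ConnectionFailure
except ImportError:
    # pymongo is not installed in this interpreter.
    MongoClient = None


def check_mongodb(host="localhost", port=27017, timeout_ms=2000):
    """Return (ok, message) describing whether MongoDB answered a ping."""
    if MongoClient is None:
        return False, "pymongo is not installed in this environment"
    # Assumption: PyMongo 3.0+, where serverSelectionTimeoutMS limits how
    # long the driver waits for a reachable server.
    client = MongoClient(host=host, port=port,
                         serverSelectionTimeoutMS=timeout_ms)
    try:
        # The constructor does not connect by itself; a ping forces it.
        client.admin.command("ping")
    except ConnectionFailure as e:
        return False, "Could not connect to MongoDB: %s" % e
    finally:
        client.close()
    return True, "Connected successfully"


if __name__ == "__main__":
    ok, message = check_mongodb()
    (sys.stdout if ok else sys.stderr).write(message + "\n")
```

The except ... as e form and the explicit write calls keep this version runnable on Python 2.7 as well as Python 3.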
Ready for more fun? How about checking out:
Installing scikit-learn in a Python Virtualenv to do General Purpose Machine Learning
THINGS THAT GO WRONG
1) Forgetting to put quotation marks around commands that are issued via the Windows Command prompt.
OK: python -c "import pymongo"
NOT OK: python -c import pymongo
2) Could not connect to MongoDB
If the MongoDB instance isn’t running, the script will display a message like:
“Could not connect to MongoDB: could not connect to localhost:27017: [Errno 10061] No connection could be made because the target machine actively refused it”
3) Using ‘pip install’ inside Python rather than at the command prompt
You have to be at the Windows command prompt, not inside the Python shell, to use pip install.
4) Not waiting long enough for numpy to compile
It took me 5 minutes. Be patient.
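For item 2 above, it can help to find out whether anything is listening on the MongoDB port before suspecting the driver. The following sketch uses only the standard library; the function name port_is_open is hypothetical, and an open port only shows that some process accepts TCP connections there, not that it is MongoDB.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, or timed out: nothing usable is listening.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:27017")
    else:
        print("Nothing is listening on localhost:27017; is MongoDB running?")
```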