Recently, I was working on a text analytics application to search documents using Python. To reduce the time taken to display results, I split the application into two stages – corpus creation and search.
In the first stage, I parsed the documents and stored the result as a corpus in pickle files. In the second stage, I loaded the corpus data from the pickle files and searched it using the user-provided text.
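The two stages can be sketched roughly as follows. This is a minimal illustration, not the actual application: the function names (`build_corpus`, `save_corpus`, `load_corpus`, `search`), the corpus layout (a dict of document name to token list) and the all-terms-must-match search are assumptions made for the example.

```python
import os
import pickle
import tempfile


def build_corpus(documents):
    """Stage 1: map each document name to its lowercased tokens (assumed layout)."""
    return {name: text.lower().split() for name, text in documents.items()}


def save_corpus(corpus, path):
    """Write the corpus to a pickle file."""
    with open(path, "wb") as fh:
        pickle.dump(corpus, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_corpus(path):
    """Stage 2: read the corpus back. Only unpickle files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def search(corpus, query):
    """Return the names of documents that contain every query term."""
    terms = query.lower().split()
    return sorted(
        name
        for name, tokens in corpus.items()
        if all(term in tokens for term in terms)
    )


if __name__ == "__main__":
    docs = {
        "a.txt": "Pickle files store Python objects",
        "b.txt": "Search the corpus for text",
    }
    with tempfile.TemporaryDirectory() as tmp:
        corpus_path = os.path.join(tmp, "corpus.pkl")
        save_corpus(build_corpus(docs), corpus_path)
        print(search(load_corpus(corpus_path), "python objects"))
```

Because this corpus holds only built-in types (dicts, lists, strings), any Python program can load it. The trouble described next starts once the corpus contains instances of custom classes.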
But I faced some difficulty handling pickle files.
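Apparently, pickle files store a ‘context’ when they are written: for an object of a custom class, the file records the name of the class and of the module where it was defined, not the class code itself. And this ‘context’ created trouble. When I created a pickle file from a separate program and tried to read it using my text analytics application, it gave an error, most likely because the application could not find the class under the recorded module name. The solution is either to incorporate the corpus update functionality into the main application, or to update the context information so that it matches that of the main application.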
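One possible way to make the context match at load time is to subclass `pickle.Unpickler` and override its `find_class` method, which pickle calls with the module and class name recorded in the file. The sketch below is an assumption about how this could be done, not what the application actually does: `Document`, `RemappingUnpickler`, `load_with_remap`, `dump_as_if_from` and the module name `corpus_builder_demo` are all hypothetical, and `dump_as_if_from` exists only to simulate a pickle written by a separate program.

```python
import io
import pickle
import sys
import types


class Document:
    """Hypothetical corpus entry; stands in for whatever class the corpus stores."""

    def __init__(self, title, text):
        self.title = title
        self.text = text


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that substitutes local classes for recorded (module, name) pairs."""

    def __init__(self, file, class_map):
        super().__init__(file)
        self.class_map = class_map

    def find_class(self, module, name):
        # pickle passes in the module and class name stored in the file.
        if (module, name) in self.class_map:
            return self.class_map[(module, name)]
        return super().find_class(module, name)


def load_with_remap(data, class_map):
    """Unpickle bytes, resolving remapped classes through class_map."""
    return RemappingUnpickler(io.BytesIO(data), class_map).load()


def dump_as_if_from(module_name, obj):
    """Simulation only: pickle obj as if its class lived in module_name."""
    cls = type(obj)
    original_module = cls.__module__
    fake = types.ModuleType(module_name)
    setattr(fake, cls.__name__, cls)
    sys.modules[module_name] = fake
    cls.__module__ = module_name
    try:
        return pickle.dumps(obj)
    finally:
        # Undo the simulation so the rest of the program is unaffected.
        cls.__module__ = original_module
        del sys.modules[module_name]
```

With these helpers, `data = dump_as_if_from("corpus_builder_demo", Document("t", "x"))` produces bytes that a plain `pickle.loads(data)` cannot resolve, while `load_with_remap(data, {("corpus_builder_demo", "Document"): Document})` returns a `Document`. A simpler alternative is to define the shared classes in one module that both the corpus-building program and the main application import, so the recorded module name is valid in both.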