Text Analysis and Semi-Supervised Learning