The demand for scoring natural language responses has created a need for new computational tools that can automatically grade student essays. Systems for automatic essay assessment have been commercially available since the 1990s. However, progress in the field has been hampered by a lack of qualitative information about the effectiveness of such systems. Most research in automatic essay grading has focused on English writing, owing to its widespread use and the availability of learner corpora and language processing software for the language. In addition, a large number of commercial tools exist for automatically grading programming assignments. In this work, we investigate document semantic similarity based on Latent Semantic Analysis (LSA) and on Latent Dirichlet Allocation (LDA). We use Gensim, an open-source Python library, to develop and implement an essay grading system that compares an essay to an answer key and assigns it a grade based on the semantic similarity between the two. We test our tool on essays of varying length and conduct experiments comparing the grades assigned by a human grader (a professor) with those produced by the automatic grading system. The results show a high correlation between the professor's grades and the grades assigned by both modeling techniques. However, LSA-based modeling produced more promising results than the LDA-based method.
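The core idea of LSA-based similarity grading can be sketched in a few lines. The following is a minimal illustration, not the paper's actual Gensim pipeline: it builds a term-document matrix over a toy answer key and two hypothetical student essays, projects the documents into a low-rank latent space with a truncated SVD (the heart of LSA), and scores each essay by its cosine similarity to the answer key. All document texts and the number of latent dimensions are invented for the example.

```python
import numpy as np

# Hypothetical toy data: an answer key followed by two student essays.
documents = [
    "photosynthesis converts light energy into chemical energy",  # answer key
    "plants use light energy and convert it to chemical energy",  # on-topic essay
    "the stock market closed higher on strong earnings reports",  # off-topic essay
]

# Build a term-document count matrix (rows = terms, columns = documents).
vocab = sorted({w for d in documents for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}
A = np.zeros((len(vocab), len(documents)))
for j, d in enumerate(documents):
    for w in d.split():
        A[index[w], j] += 1

# LSA: a truncated SVD projects documents into a low-rank latent space,
# so documents sharing related terms end up close together.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                   # number of latent dimensions (chosen for the toy data)
docs_k = (np.diag(s[:k]) @ Vt[:k]).T    # each row: one document in the latent space

def cosine(a, b):
    """Cosine similarity between two latent-space document vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

key, essay_on, essay_off = docs_k
print("on-topic essay: ", cosine(key, essay_on))
print("off-topic essay:", cosine(key, essay_off))
```

The on-topic essay scores a much higher similarity to the answer key than the off-topic one; a grading system can then map that similarity score onto a grade scale. The paper's implementation uses Gensim's `LsiModel` and `LdaModel` with its similarity indices rather than this hand-rolled SVD.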