Topic modeling and text similarity