The rapid growth of RDF data in the Linked Open Data (LOD) cloud offers unprecedented opportunities for analyzing such data using machine learning algorithms. The massive size and distributed nature of the LOD cloud present a challenging machine learning problem in which the data can only be accessed remotely, i.e., through a query interface such as the SPARQL endpoint of the data store. Existing approaches to learning classifiers from RDF data in such a setting fail to take advantage of the RDF Schema (RDFS) associated with the data store, which asserts subclass hierarchies that the learner can potentially exploit. Against this background, we present a general approach that augments an existing directed graphical model with hidden variables that encode subclass hierarchies via probabilistic constraints. We also present an algorithm, ProbAVT, that adopts the variational Bayesian expectation maximization approach to efficiently learn parameters in such settings. Our experiments with several synthetic and real-world datasets show that: (i) ProbAVT matches or outperforms its counterpart that does not incorporate background knowledge in the form of subclass hierarchies; and (ii) ProbAVT remains competitive with other state-of-the-art models that incorporate subclass hierarchies, and scales to large hierarchies with tens of thousands of nodes.