TY - JOUR

T1 - Learning DFA from simple examples

AU - Parekh, Rajesh

AU - Honavar, Vasant

N1 - Funding Information:
The authors wish to thank Jack Lutz for introducing them to Kolmogorov complexity, Giora Slutzki for several helpful discussions on automata theory, Pierre Dupont for a careful review of an earlier draft of this paper and several helpful suggestions, Colin de la Higuera and Franc¸ois Denis for discussions which helped clarify some of the issues related to collusion, and several anonymous referees for their insightful comments that have helped us to improve the paper significantly. Vasant Honavar is grateful to the National Science Foundation for the research grants IRI-9409580 and IRI-9643299 that partially supported this work.

PY - 2001/7

Y1 - 2001/7

N2 - Efficient learning of DFA is a challenging research problem in grammatical inference. It is known that both exact and approximate (in the PAC sense) identifiability of DFA is hard. Pitt has posed the following open research problem: "Are DFA PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?" (Pitt, in Lecture Notes in Artificial Intelligence, 397, pp. 18-44, Springer-Verlag, 1989). We demonstrate that the class of DFA whose canonical representations have logarithmic Kolmogorov complexity is efficiently PAC learnable under the Solomonoff Levin universal distribution (m). We prove that the class of DFA is efficiently learnable under the PACS (PAC learning with simple examples) model (Denis, D'Halluin & Gilleron, STACS'96 -Proceedings of the 13th Annual Symposium on the Theoretical Aspects of Computer Science, pp. 231-242, 1996) wherein positive and negative examples are sampled according to the universal distribution conditional on a description of the target concept. Further, we show that any concept that is learnable under Gold's model of learning from characteristic samples, Goldman and Mathias'polynomial teachability model, and the model of learning from example based queries is also learnable under the PACS model.

AB - Efficient learning of DFA is a challenging research problem in grammatical inference. It is known that both exact and approximate (in the PAC sense) identifiability of DFA is hard. Pitt has posed the following open research problem: "Are DFA PAC-identifiable if examples are drawn from the uniform distribution, or some other known simple distribution?" (Pitt, in Lecture Notes in Artificial Intelligence, 397, pp. 18-44, Springer-Verlag, 1989). We demonstrate that the class of DFA whose canonical representations have logarithmic Kolmogorov complexity is efficiently PAC learnable under the Solomonoff Levin universal distribution (m). We prove that the class of DFA is efficiently learnable under the PACS (PAC learning with simple examples) model (Denis, D'Halluin & Gilleron, STACS'96 -Proceedings of the 13th Annual Symposium on the Theoretical Aspects of Computer Science, pp. 231-242, 1996) wherein positive and negative examples are sampled according to the universal distribution conditional on a description of the target concept. Further, we show that any concept that is learnable under Gold's model of learning from characteristic samples, Goldman and Mathias'polynomial teachability model, and the model of learning from example based queries is also learnable under the PACS model.

UR - http://www.scopus.com/inward/record.url?scp=0035399645&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035399645&partnerID=8YFLogxK

U2 - 10.1023/A:1010822518073

DO - 10.1023/A:1010822518073

M3 - Article

AN - SCOPUS:0035399645

VL - 44

SP - 9

EP - 35

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

IS - 1-2

ER -