We provide an update on JSegMan, a system that extends the ACT-R cognitive architecture to interact with dynamic interfaces based on screen contents and to generate input for the operating system directly. Current ACT-R models typically interact with the world through ACT-R's device interface, an abstract representation of the world based on a simulated Lisp environment provided with ACT-R, or through instrumented interfaces. JSegMan extends the ACT-R cognitive architecture with computer vision pattern-matching algorithms and visual patterns. With JSegMan, models directly move the cursor on the screen, click on application GUI objects on PCs, and type, using existing Java libraries. Implementing users' visual search strategies and input abilities for different visual objects enables the detailed modeling of interactive tasks on any interface. The visual pattern-matching algorithms serve two goals: to simulate user behavior in interactive tasks and to create representations of visual stimuli. We tested the visual pattern-matching approach with an existing model of a long spreadsheet task. The revised model more accurately predicted performance on the 20-min task while performing it entirely on an uninstrumented, unmodified interface.
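To make the pattern-matching idea concrete, here is a minimal sketch, not the authors' implementation: it locates a small pixel pattern on a "screen" by exhaustive sum-of-absolute-differences template matching and derives a cursor target at the pattern's center. All function names and the toy data are hypothetical; JSegMan's actual algorithms and its OS-level input (via Java libraries) are not reproduced here.

```python
# Hypothetical sketch of visual pattern matching for GUI interaction.
# Screens and patterns are 2-D lists of grayscale pixel values.

def find_pattern(screen, pattern):
    """Return (row, col) of the best-matching placement of `pattern`
    in `screen`, by minimum sum of absolute differences (SAD)."""
    sh, sw = len(screen), len(screen[0])
    ph, pw = len(pattern), len(pattern[0])
    best, best_pos = None, None
    for r in range(sh - ph + 1):
        for c in range(sw - pw + 1):
            # SAD over the candidate window; lower is a better match.
            sad = sum(abs(screen[r + i][c + j] - pattern[i][j])
                      for i in range(ph) for j in range(pw))
            if best is None or sad < best:
                best, best_pos = sad, (r, c)
    return best_pos

def click_target(pos, pattern):
    """Cursor target at the matched pattern's center; a model would
    then issue an OS-level click (e.g. via a Java robot library)."""
    r, c = pos
    return (r + len(pattern) // 2, c + len(pattern[0]) // 2)

if __name__ == "__main__":
    screen = [[0] * 8 for _ in range(8)]
    # Paint a 2x2 "button" at row 3, column 5.
    for i in (3, 4):
        for j in (5, 6):
            screen[i][j] = 255
    button = [[255, 255], [255, 255]]
    pos = find_pattern(screen, button)
    print(pos, click_target(pos, button))  # → (3, 5) (4, 6)
```

A real system would match against live screen captures and tolerate rendering noise, but the pipeline is the same: match a visual pattern, then direct the cursor to the matched location.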