Quick start

Basic example

We have a collection of movies and know that some of them are duplicates. To identify the duplicates, we define a set of rules that decide whether two movies match. The rule set may look like this:

rules_text = """
discarded: Those whose titles are very different
titles.similarity <= 50
=> mismatch

perfect: Those matching almost perfectly
titles.similarity >= 90 and years.difference < 2
=> match
"""

After assigning the rule definition to a string variable, it must be compiled into its equivalent object representation as follows:

from mucho.dsl.compiler import Compiler

rules_text = """
discarded: Those whose titles are very different
titles.similarity <= 50
=> mismatch

perfect: Those matching almost perfectly
titles.similarity >= 90 and years.difference < 2
=> match
"""

compiler = Compiler()
rules = compiler.compile(rules_text, debug=True)

Let’s evaluate the rules to determine whether two specific movies match. To do so, we need a context that holds the properties resulting from comparing the two movies:

from mucho.comparison import ComparisonResult
from mucho.dsl.compiler import Compiler

rules_text = """
discarded: Those whose titles are very different
titles.similarity <= 50
=> mismatch

perfect: Those matching almost perfectly
titles.similarity >= 90 and years.difference < 2
=> match
"""

compiler = Compiler()
rules = compiler.compile(rules_text, debug=True)

context = ComparisonResult(
    titles=ComparisonResult(similarity=95),
    years=ComparisonResult(difference=1))

Once the context is available and the rule set is compiled, the mucho virtual machine can perform the evaluation:

from mucho.comparison import ComparisonResult
from mucho.dsl.compiler import Compiler
from mucho.dsl.virtualmachine import VirtualMachine

rules_text = """
discarded: Those whose titles are very different
titles.similarity <= 50
=> mismatch

perfect: Those matching almost perfectly
titles.similarity >= 90 and years.difference < 2
=> match
"""

compiler = Compiler()
rules = compiler.compile(rules_text, debug=True)

context = ComparisonResult(
    titles=ComparisonResult(similarity=95),
    years=ComparisonResult(difference=1))

virtual_machine = VirtualMachine()
satisfied_rule = virtual_machine.run(rules, context)
if satisfied_rule:
    print("Rule '{0}' was satisfied with result: {1}".format(
        satisfied_rule.id, satisfied_rule.result.value))
else:
    print("No rule was satisfied")

The rule named perfect is satisfied by the context, which means that the two movies whose comparison result was evaluated are a match.

The comparison result used as evaluation context can be obtained in many different ways. Mucho provides an approach to define comparators that obtain this result.
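As one illustration of where such values could come from, the sketch below computes a title similarity and a year difference using only the Python standard library. It does not use mucho's comparator API: the `compare_movies` function and the dictionary layout of a movie are hypothetical, `SimpleNamespace` merely stands in for a nested result object, and the 0 to 100 similarity scale is an assumption inferred from the thresholds in the rules.

```python
from difflib import SequenceMatcher
from types import SimpleNamespace


def compare_movies(movie_a, movie_b):
    """Build a nested result with titles.similarity and years.difference.

    Hypothetical helper for illustration; not part of mucho.
    """
    # SequenceMatcher.ratio() returns a float in [0, 1]; scale it to 0..100
    # (assumed scale, matching the thresholds 50 and 90 used in the rules).
    ratio = SequenceMatcher(
        None, movie_a["title"].lower(), movie_b["title"].lower()).ratio()
    return SimpleNamespace(
        titles=SimpleNamespace(similarity=round(ratio * 100)),
        years=SimpleNamespace(
            difference=abs(movie_a["year"] - movie_b["year"])))


result = compare_movies(
    {"title": "The Matrix", "year": 1999},
    {"title": "the matrix", "year": 2000})
print(result.titles.similarity, result.years.difference)
```

The values computed this way would then be passed to the evaluation context, for example as the `similarity` and `difference` arguments shown earlier.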