> *(
Z
F2yildizol@cmpe.boun.edu.tr/00DTimes New Roman5t\ )0tY0DTahomaew Roman5t\ )0tY0" DWingdingsRoman5t\ )0tY00DSymbolgsRoman5t\ )0tY0a.
@n?" dd@ @@``P)!"%")U
++0e0eA@A5%8c8c
?1d0u0@Ty2 NP'p<'p@A)BCDE?0@8%&7uʚ;2Nʚ;g4CdCd )0h
ppp@<4!d!d0$5<4dddd0$5?%!Learning Rules from DataOlcay Taner Y1ld1z, Ethem Alpayd1n
yildizol@cmpe.boun.edu.tr
Department of Computer Engineering
Boazii University2#7
0#<Rule Induction?oDerive meaningful rules from data
Mainly used in classification problems
Attribute types (Continuous, Discrete)Vp1oRulesDisjunction: conjunctions are binded via OR's
Conjunction: propositions are binded via AND's
Proposition: relation on an attribute
Attribute is Continuous (defines a subinterval)
Attribute is Discrete (= one of the values of the attribute)6ml How to generate rules?GRule Induction Techniques
Via Trees
C4.5Rules
Directly from Data
Ripperb
GC4.5Rules (Quinlan, 93)}Create decision tree using C4.5
Convert the decision tree to a ruleset by writing each path from root to the leaves as a rule~~}' C4.5Rules RIPPER (Cohen, 95)
6Split (pos, neg) into growset and pruneset
Rule := grow conjunction with growset
Add propositions one by one
IF (x1 > q1) AND (x2 > q2) AND (x2 < q3) THEN Y = OK
Rule := prune conjunction with pruneset
Remove propositions
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
If MDL < Best MDL + 64
Add conjunction
Else
BreakRQZQZ(Z;ZZZZZQ
(
r P @Ripper (Optimization)>Repeat K times
For each rule
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
Generate revisionrule by adding propositions
IF (x1 > q1) AND (x2 < q3) AND (x1 > q3) THEN Y = OK
Generate replacementrule by regrowing
IF (x1 > q4) THEN Y = OK
Compare current rule with revisionrule and replacementrule
Take the best according to MDL>"
2
+
[" A
: hMinimum Description LengthTDescription Length of a Ruleset
Description Length of Rules
S = k + k log2 (n / k) + (n k) log2 (k / (n k))
Description Length of Exceptions
S = log2 (D + 1)
+ fp (log2 (e / 2C))
+ (C fp) (log2 (1  e / 2C))
+ fn (log2 (fn / 2U))
+ (U fn) (log2 (1  fn / 2U)) 8! ! !*Ripper*Finding best condition is done by trying all possible split points (timeconsuming)
Shortcut: Linear Discriminant Analysis
Split point is calculated analytically
To be more robust
Instances further than 3 s are removed
If number of examples < 20, shortcut not used8U<Experiments29 datasets from UCI repository are used
10 fold crossvalidation
Comparison done using onesided t test
Comparison of three algorithms C4.5Rules, Ripper, Ripper*
Comparison based on
Error Rate
Complexity of the rulesets
Learning TimeDZ4Z
4!Error Rate (I)Ripper and its variant have better performance than C4.5Rules
Ripper* has similar performance compared to Ripper
C4.5Rules has advantage when the number of rules are small (Exhaustive Search)6D
z
Error Rate (II)"Ruleset Complexity (I)Ripper and Ripper* produce significantly small number of rules compared to C4.5Rules
C4.5Rules starts with an unpruned tree, which is a large amount of rule to start6
Ruleset Complexity (II)#Learning Time (I)lRipper* better than Ripper, which is better than C4.5Rules
C4.5Rules O(N3)
Ripper O(Nlog2N)
Ripper* O(NlogN)m
>
l Learning Time (II)$
Conclusion
Comparison of two rule induction algorithms C4.5Rules and Ripper
Proposed a shortcut in learning conditions using LDA (Ripper*)
Ripper is better than C4.5Rules
Ripper* improves learning time of Ripper without decreasing its performance@}
(
E ` 33PP` 3333` ___MMM` 13` 333fpKNāvI` j@v۩ῑH` Q_{>?" dd@,?n<d@ `7 `2@`7``2 n?" dd@ @@``PR @ ` `p>>
(
<m"
@
Tpd"
@
<s"U_
@
TTvd">&
@
NLy"P
@
<l"p
@
Cxd?d?"bUv
@
6 "U
T Click to edit Master title style!
!$
0Ԅ "
RClick to edit Master text styles
Second level
Third level
Fourth level
Fifth level!
S
6 "@
B*
6 "@`
D*
6 "`
D*B
s*h ? 3333 Blends%
0k (
T +
"+bb P@
#"Dwoh
s*"PP
Bd" P@bb P
0
#"Nyh
s*"P
Bd"P
0z
<" a*h
s*"
f?d?"+)
<쇖 ?"pP
T Click to edit Master title style!
!
0 " `
W#Click to edit Master subtitle style$
$
6 "`p
F*
6( "`p
H*
6薖 "`
H*B
s*h ? 3333 0(
x
c$PpP
x
c$
@
H
0h ? ̙33
@!!O(
x
c$3 U
x
c$4
XB
0D
P
XB
@
0D PP
XB
0DPXB
0DPXB
@
0D pp
0l7
LName
<; v
NIncome
<> :
U
Owns a house?
XB
@
0D
<?
VMarital status
0C
?
WAli
0F
P25,000 $
0J
KYes
0M
t
OMarried
0L`
LVeli
0Tv
P18,000 $
<TX
JNo
0[
OMarried
XB
0D
P
XB
@
0D 00
Z0_X)? 00
QDefault
XB
@
0D
ZcX)?
p
VNo$
ZgX)?0
WYes$
XB
@
0D PPpXB
@
0D PPXB
@
0D ppXB
@
0D
XB
@
0D 00XB
!@
0D H
0h ? xiPfr}OA333
P*(
x
c$ U
r
S,
0
H
0h ? xiPfr}OA333
` 0(
x
c$ U
x
c$
H
0h ? xiPfr}OA333
pL6(
L~
L s*\p U
x
L c$h
0p@
H
L0h ? xiPfr}OA333*
**A+*(
r
S U
F @
A
fB
B
6Dpp
Z
C
s*00Z
D
s*0 Z
E
s*p`Z
F
s*PpZ
G
s*P Z
H
s*
p
I
0x@
rx2 : savings6
Z
J
s*@
Z
K
s*
@
L
TX)?p
0
M
TX)?
@
N
TX)?
O
TX)?0
P
TX)? P
Q
TX)?P
R
TX)?``
S
TX)?p
T
TX)?` `
Z
U
s*Pp
V
TX)? PB
W
`D3fԔX)?
B
X
`DԔX)?
0
B
Y
ZDX)? B
ZB
ZDX)?
fB
[
6D``
\
<X4
xx1 : yearlyincome6
B
]
ZDX)?0B
^
ZDX)? 0 B
_
ZDX)?
0
`
`(X)?
Tq1"
fB
a
6Dpp
B
b
ZDX)? B
c
ZDX)?B
d
ZDX)?
e
ZX)?JR
R
OK
DEFAULT
f
`X)?
Tq2"
F PAP
g
N `p
hPI 2
i
ZX)?`p +
j
`PX)?`
x1 > q1P
N `p
ki 2
l
ZX)?`p +
m
`X)?`
x2 > q2P
B
nB
`DX)?0N `p
oA 2
p
ZX)?`p
q
`X)?`
r y = 0<
B
r
`DX)?@ N `p
sP
2
t
ZX)?`p
u
`X)?`
h y = 12
N `p
v1 P
2
w
ZX)?`p
x
`X)?`
r y = 0<
B
yB
`DX)? B
z
`DX)?`
{
`lX)?0XP
gyes
(

`@X)? .
Zno
}
`X)?0>
P
Zno
~
`8X)?
[yes
f1 ?
iRules:
IF (x1 > q1) AND (x2 > q2) THEN Y = OK
IF (x1 < q1) OR ((x1 > q1) AND (x2 < q2)) THEN
Y = DEFAULTW
H
0h ? 3333
T(
T~
T s*# U
[
T
f871 ?0P
Learn rule for each class
Objective class is positive, other classes are negative
Two Steps
Initialization
Learn rules one by one
Immediately prune learned rule
Optimization
Since search is greeedy, pass k times over the rules to optimize
8
6
A8
6
AH
T0h ? xiPfr}OA333)
Xi(
Xx
X c$
0
X
6 U
KRIPPER (Initialization)H
X0h ? xiPfr}OA333
\<(
\~
\ s*I U
~
\ s*J
P
H
\0h ? xiPfr}OA333
l*(
lx
l c$XN U
r
l SO
P
H
l0h ? xiPfr}OA333
p0(
px
p c$v U
x
p c$$
H
p0h ? xiPfr}OA333
x0(
xx
x c$T U
x
x c$
H
x0h ? xiPfr}OA333
0(
x
c$a U
x
c$hb
H
0h ? xiPfr}OA333"
L"D"(!(
x
 c$f U
$!n p
#"&p

f[1?6
R
@`

f1?
6
e9 @`

f@l1?
e5 @`

fw1?
h12 @`

f4r1?p
kTotal @`

f1?6
h10 @`

f1?
6
e @`

f1?
e1 @`

fD1?
h10 @`

fp1?p
wRipper* @`

f̭1?6
h12 @`

fH1? 6
e2 @`

f1?
e @`

f1?
h12 @`

f,1?p
lRipper @`

f01?6
e7 @`

f1?6
e7 @`

f1?
e4 @`

f1?
e @`

f01?p
o C4.5Rules
@`

fh1?6
kTotal @`

f21?6
wRipper* @`

f021?
lRipper @`

fP21?
o C4.5Rules
@`

f21?p
R
@`B

To
?p ~B

N1
?p ~B

N1
?p ~B

N1
?p
~B
!
N1
?p
B
"
To
?p B
#
To
?pp~B
$
N1
?~B
%
N1
?~B
&
N1
?~B
'
N1
?66B
(
To
? H
0h ? xiPfr}OA333
0(
x
c$2 U
2
x
c$2
2
H
0h ? xiPfr}OA333"
X"P" (*!(
~
s*h12 U
2
*!n p
*#"&p
f921?6
R
@`
f1?
6
h11 @`
f@21?
e3 @`
fH21?
h27 @`
f1?p
kTotal @`
fU21?6
h27 @`
f\]21?
6
e @`
f^21?
e2 @`
fpf21?
h27 @`
f(u21?p
wRipper* @`
f21?6
h27 @`
f}21? 6
h10 @`
fH21?
e @`
f@21?
h26 @`
fd21?p
lRipper @`
f21?6
e1 @`
f21?6
e1 @`
f21?
e1 @`
f21?
e @`
fP21?p
o C4.5Rules
@`
f21?6
kTotal @`
f,21?6
wRipper* @`
f21?
lRipper @`
f21?
o C4.5Rules
@`
f21?p
R
@`B
To
?p ~B
N1
?p ~B
N1
?p ~B
N1
?p
~B
!
N1
?p
B
"
To
?p B
#
To
?pp~B
$
N1
?~B
%
N1
?~B
&
N1
?~B
'
N1
?66B
(
To
? H
0h ? xiPfr}OA333
00(
x
c$2 U
2
x
c$2
2
H
0h ? xiPfr}OA333"
X"P"@(*!(
~
s*X2 U
2
*!n p
*#"&p
f$41?6
R
@`
f41?
6
e0 @`
f@
41?
h13 @`
fD41?
h25 @`
fH41?p
kTotal @`
f&41?6
h27 @`
f'41?
6
e @`
f541?
h13 @`
f641?
h25 @`
fE41?p
wRipper* @`
fdM41?6
h23 @`
fT41? 6
e0 @`
fV41?
e @`
fc41?
h23 @`
f^41?p
lRipper @`
f q1) AND (x2 > q2) AND (x2 < q3) THEN Y = OK
Rule := prune conjunction with pruneset
Remove propositions
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
If MDL < Best MDL + 64
Add conjunction
Else
BreakRQZQZ(Z;ZZZZZQ
(
r P @Ripper (Optimization)>Repeat K times
For each rule
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
Generate revisionrule by adding propositions
IF (x1 > q1) AND (x2 < q3) AND (x1 > q3) THEN Y = OK
Generate replacementrule by regrowing
IF (x1 > q4) THEN Y = OK
Compare current rule with revisionrule and replacementrule
Take the best according to MDL>"
2
+
[" A
: hMinimum Description LengthTDescription Length of a Ruleset
Description Length of Rules
S = k + k log2 (n / k) + (n k) log2 (k / (n k))
Description Length of Exceptions
S = log2 (D + 1)
+ fp (log2 (e / 2C))
+ (C fp) (log2 (1  e / 2C))
+ fn (log2 (fn / 2U))
+ (U fn) (log2 (1  fn / 2U)) 8! ! !*Ripper*Finding best condition is done by trying all possible split points (timeconsuming)
Shortcut: Linear Discriminant Analysis
Split point is calculated analytically
To be more robust
Instances furthe
!"#$%&'()*+,./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{}~Root EntrydO)0#Q[MCurrent User8SummaryInformation('PowerPoint Document(]DocumentSummaryInformation8d5t\ )0tY0a.
@n?" dd@ @@``P)!"%")U
++0e0eA@A5%8c8c
?1d0u0@Ty2 NP'p<'p@A)BCDE?0@8%&7uʚ;2Nʚ;g4CdCd )0h
ppp@<4!d!d0$5<4dddd0$5?%!Learning Rules from DataOlcay Taner Y1ld1z, Ethem Alpayd1n
yildizol@cmpe.boun.edu.tr
Department of Computer Engineering
Boazii University2#7
0#<Rule Induction?oDerive meaningful rules from data
Mainly used in classification problems
Attribute types (Continuous, Discrete)Vp1oRulesDisjunction: conjunctions are binded via OR's
Conjunction: propositions are binded via AND's
Proposition: relation on an attribute
Attribute is Continuous (defines a subinterval)
Attribute is Discrete (= one of the values of the attribute)6ml How to generate rules?GRule Induction Techniques
Via Trees
C4.5Rules
Directly from Data
Ripperb
GC4.5Rules (Quinlan, 93)}Create decision tree using C4.5
Convert the decision tree to a ruleset by writing each path from root to the leaves as a rule~~}' C4.5Rules RIPPER (Cohen, 95)
6Split (pos, neg) into growset and pruneset
Rule := grow conjunction with growset
Add propositions one by one
IF (x1 > q1) AND (x2 > q2) AND (x2 < q3) THEN Y = OK
Rule := prune conjunction with pruneset
Remove propositions
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
If MDL < Best MDL + 64
Add conjunction
Else
BreakRQZQZ(Z;ZZZZZQ
(
r P @Ripper (Optimization)>Repeat K times
For each rule
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
Generate revisionrule by adding propositions
IF (x1 > q1) AND (x2 < q3) AND (x1 > q3) THEN Y = OK
Generate replacementrule by regrowing
IF (x1 > q4) THEN Y = OK
Compare current rule with revisionrule and replacementrule
Take the best according to MDL>"
2
+
[" A
: hMinimum Description LengthTDescription Length of a Ruleset
Description Length of Rules
S = k + k log2 (n / k) + (n k) log2 (k / (n k))
Description Length of Exceptions
S = log2 (D + 1)
+ fp (log2 (e / 2C))
+ (C fp) (log2 (1  e / 2C))
+ fn (log2 (fn / 2U))
+ (U fn) (log2 (1  fn / 2U)) 8! ! !*Ripper*Finding best condition is done by trying all possible split points (timeconsuming)
Shortcut: Linear Discriminant Analysis
Split point is calculated analytically
To be more robust
Instances further than 3 s are removed
If number of examples < 20, shortcut not used8U<Experiments29 datasets from UCI repository are used
10 fold crossvalidation
Comparison done using onesided t test
Comparison of three algorithms C4.5Rules, Ripper, Ripper*
Comparison based on
Error Rate
Complexity of the rulesets
Learning TimeDZ4Z
4!Error Rate (I)Ripper and its variant have better performance than C4.5Rules
Ripper* has similar performance compared to Ripper
C4.5Rules has advantage when the number of rules are small (Exhaustive Search)6D
z
Error Rate (II)"Ruleset Complexity (I)Ripper and Ripper* produce significantly small number of rules compared to C4.5Rules
C4.5Rules starts with an unpruned tree, which is a large amount of rule to start6
Ruleset Complexity (II)#Learning Time (I)lRipper* better than Ripper, which is better than C4.5Rules
C4.5Rules O(N3)
Ripper O(Nlog2N)
Ripper* O(NlogN)m
>
l Learning Time (II)$
Conclusion
Comparison of two rule induction algorithms C4.5Rules and Ripper
Proposed a shortcut in learning conditions using LDA (Ripper*)
Ripper is better than C4.5Rules
Ripper* improves learning time of Ripper without decreasing its performance@}
(
E
0(
x
c$2 U
2
x
c$2
2
H
0h ? xiPfr}OA333r9"3 5'*(
Z
F2yildizol@cmpe.boun.edu.tr/00
՜.+,D՜.+,\
Onscreen Showns] Times New RomanTahoma
WingdingsSymbolBlendsLearning Rules from DataRule Induction?RulesHow to generate rules?C4.5Rules (Quinlan, 93)
C4.5RulesRIPPER (Cohen, 95)PowerPoint PresentationRipper (Optimization)Minimum Description LengthRipper*ExperimentsError Rate (I)Error Rate (II)Ruleset Complexity (I)Ruleset Complexity (II)Learning Time (I)Learning Time (II)ConclusionFonts UsedDesign Template
Slide Titles 8@_PID_HLINKSA!mailto:yildizol@cmpe.boun.edu.tr _9BogaziciBogazicir than 3 s are removed
If number of examples < 20, shortcut not used8U<Experiments29 datasets from UCI repository are used
10 fold crossvalidation
Comparison done using onesided t test
Comparison of three algorithms C4.5Rules, Ripper, Ripper*
Comparison based on
Error Rate
Complexity of the rulesets
Learning TimeDZ4Z
4!Error Rate (I)Ripper and its variant have better performance than C4.5Rules
Ripper* has similar performance compared to Ripper
C4.5Rules has advantage when the number of rules are small (Exhaustive Search)6D
z
Error Rate (II)"Ruleset Complexity (I)Ripper and Ripper* produce significantly small number of rules compared to C4.5Rules
C4.5Rules starts with an unpruned tree, which is a large amount of rule to start6
Ruleset Complexity (II)#Learning Time (I)lRipper* better than Ripper, which is better than C4.5Rules
C4.5Rules O(N3)
Ripper O(Nlog2N)
Ripper* O(NlogN)m
>
l Learning Time (II)$
Conclusion
Comparison of two rule induction algorithms C4.5Rules and Ripper
Proposed a shortcut in learning conditions using LDA (Ripper*)
Ripper is better than C4.5Rules
Ripper* improves learning time of Ripper without decreasing its performance@}
(
ErE5!5`'*(
Z
F2yildizol@cmpe.boun.edu.tr/00DTimes New Roman5t\ )0tY0DTahomaew Roman5t\ )0tY0" DWingdingsRoman5t\ )0tY00DSymbolgsRoman5t\ )0tY0a.
@n?" dd@ @@``P)!"%")U
++0e0eA@A5%8c8c
?1d0u0@Ty2 NP'p<'p@A)BCDE?0@8%&7uʚ;2Nʚ;g4CdCd )0h
ppp@<4!d!d0$5<4dddd0$5?%!Learning Rules from DataOlcay Taner Y1ld1z, Ethem Alpayd1n
yildizol@cmpe.boun.edu.tr
Department of Computer Engineering
Boazii University2#7
0#<Rule Induction?oDerive meaningful rules from data
Mainly used in classification problems
Attribute types (Continuous, Discrete)Vp1oRulesDisjunction: conjunctions are binded via OR's
Conjunction: propositions are binded via AND's
Proposition: relation on an attribute
Attribute is Continuous (defines a subinterval)
Attribute is Discrete (= one of the values of the attribute)6ml How to generate rules?GRule Induction Techniques
Via Trees
C4.5Rules
Directly from Data
Ripperb
GC4.5Rules (Quinlan, 93)}Create decision tree using C4.5
Convert the decision tree to a ruleset by writing each path from root to the leaves as a rule~~}' C4.5Rules RIPPER (Cohen, 95)
6Split (pos, neg) into growset and pruneset
Rule := grow conjunction with growset
Add propositions one by one
IF (x1 > q1) AND (x2 > q2) AND (x2 < q3) THEN Y = OK
Rule := prune conjunction with pruneset
Remove propositions
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
If MDL < Best MDL + 64
Add conjunction
Else
BreakRQZQZ(Z;ZZZZZQ
(
r P @Ripper (Optimization)>Repeat K times
For each rule
IF (x1 > q1) AND (x2 < q3) THEN Y = OK
Generate revisionrule by adding propositions
IF (x1 > q1) AND (x2 < q3) AND (x1 > q3) THEN Y = OK
Generate replacementrule by regrowing
IF (x1 > q4) THEN Y = OK
Compare current rule with revisionrule and replacementrule
Take the best according to MDL>"
2
+
[" A
: hMinimum Description LengthTDescription Length of a Ruleset
Description Length of Rules
S = k + k log2 (n / k) + (n k) log2 (k / (n k))
Description Length of Exceptions
S = log2 (D + 1)
+ fp (log2 (e / 2C))
+ (C fp) (log2 (1  e / 2C))
+ fn (log2 (fn / 2U))
+ (U fn) (log2 (1  fn / 2U)) 8! ! !*Ripper*Finding best condition is done by trying all possible split points (timeconsuming)
Shortcut: Linear Discriminant Analysis
Split point is calculated analytically
To be more robust
Instances further than 3 s are removed
If number of examples < 20, shortcut not used8U<Experiments29 datasets from UCI repository are used
10 fold crossvalidation
Comparison done using onesided t test
Comparison of three algorithms C4.5Rules, Ripper, Ripper*
Comparison based on
Error Rate
Complexity of the rulesets
Learning TimeDZ4Z
4!Error Rate (I)Ripper and its variant have better performance than C4.5Rules
Ripper* has similar performance compared to Ripper
C4.5Rules has advantage when the number of rules are small (Exhaustive Search)6D
z
Error Rate (II)"Ruleset Complexity (I)Ripper and Ripper* produce significantly small number of rules compared to C4.5Rules
C4.5Rules starts with an unpruned tree, which is a large amount of rule to start6
Ruleset Complexity (II)#Learning Time (I)lRipper* better than Ripper, which is better than C4.5Rules
C4.5Rules O(N3)
Ripper O(Nlog2N)
Ripper* O(NlogN)m
>
l Learning Time (II)$
Conclusion
Comparison of two rule induction algorithms C4.5Rules and Ripper
Proposed a shortcut in learning conditions using LDA (Ripper*)
Ripper is better than C4.5Rules
Ripper* improves learning time of Ripper without decreasing its performance@}
(
ErQ``)'