A Robust Control Framework for Malware Filtering

Michael Bloem, Tansu Alpcan, Member, IEEE, and Tamer Başar, Fellow, IEEE

Abstract: We study and develop a robust control framework for malware filtering and network security. We investigate the malware filtering problem by capturing the tradeoff between increased security on one hand and continued usability of the network on the other. We analyze the problem using a linear control system model with a quadratic cost structure and develop algorithms based on $H^\infty$-optimal control theory. A dynamic feedback filter is derived and shown via numerical analysis to be an improvement over various heuristic approaches to malware filtering. The results are verified and demonstrated with packet-level simulations on the Ns-2 network simulator.

Index Terms: Network security, invasive software (malware) filtering, control theory, $H^\infty$-optimal control.

I. INTRODUCTION

Attacks on computer networks, such as worm or denial-of-service attacks, are difficult to prevent in part due to the challenge of detecting and stopping them while still allowing legitimate network usage. Recent experience with Internet worm attacks makes this point more clear: within 10 minutes the Slammer worm had infected 90% of vulnerable computers in 2003, and the Code Red virus infected hundreds of thousands of hosts in 2001 [1], [2]. The base-rate fallacy captures the essence of this problem. Even if we have low false-negative and false-positive rates in our detection of malware, there is so much more legitimate network usage than illegitimate usage that we end up with many false alarms [3]. The incredible variety in legitimate network traffic makes accurately differentiating it from malicious traffic even more challenging.
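The base-rate effect can be made concrete with Bayes' rule. The sketch below is illustrative only: the function name and the prior values are assumptions chosen for the example, not measurements from this paper. It computes the probability that a flagged packet is actually malicious for a given true-positive rate, false-positive rate, and fraction of malicious traffic.

```python
def posterior_malicious(tpr, fpr, prior):
    """P(malicious | alarm) by Bayes' rule.

    tpr:   true-positive (detection) rate
    fpr:   false-positive rate
    prior: fraction of all packets that are malicious (assumed value)
    """
    p_alarm = tpr * prior + fpr * (1.0 - prior)
    return tpr * prior / p_alarm


if __name__ == "__main__":
    # Hypothetical priors: the rarer malicious traffic is, the less an
    # alarm means, even with a very low false-positive rate.
    for prior in (1e-2, 1e-5, 1e-7):
        print(prior, posterior_malicious(0.7, 1e-5, prior))
```

With these assumed rates, once malicious packets are as rare as the false-positive rate itself, fewer than half of all alarms correspond to real malware.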
A more detailed analysis of the detection of a particular type of worm epidemic in [4] shows the challenge of detecting some worm attacks even under idealized conditions. In this specific case the base-rate fallacy again comes into play, as “a substantial volume of ‘background radiation’” is to blame for making the detection of random constant scanning worms difficult. Intrusion detection systems must be constructed with this dilemma in mind, and thus need to be conservative in their operation.

M. Bloem is with the NASA Ames Research Center, Moffett Field, CA 94035, USA (e-mail: michael.bloem@nasa.gov). He was with the University of Illinois at Urbana-Champaign and partly supported by Deutsche Telekom AG Laboratories during this project. T. Başar is with the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: tbasar@control.csl.uiuc.edu). T. Alpcan is with Deutsche Telekom Laboratories, D-10587 Berlin, Germany (e-mail: tansu.alpcan@telekom.de).

According to Federal Bureau of Investigation (FBI) statistics, 70% of security problems originate within an organization, and 20% of respondents to an FBI survey indicated that intruders had broken into or attempted to break into their corporate networks in the last 12 months [5]. Therefore, dynamic firewalls such as the Cisco Internetwork Operating System (IOS) firewall are an important form of internal network security [5]. Our aim is to develop algorithms and policies for such (re)configurable firewalls in order to filter malware traffic such as worms, viruses, spam, and Trojan horses. We use $H^\infty$-optimal control theory to determine how to dynamically change filtering rules or parameters in order to ensure a certain performance level.
We note that in $H^\infty$-optimal control, by viewing the disturbance as an intelligent maximizing opponent in a dynamic zero-sum game, who plays with knowledge of the minimizer's control action, one evaluates the system under the worst possible conditions. This approach applies naturally to the problem of malware response because the traffic deviation resulting from a malware attack is not merely random noise, but represents the efforts of an intelligent attacker. Therefore, we determine the control action that will minimize costs under these worst circumstances [6]. The resulting conservative controller works well even in light of the base-rate fallacy problem. To the best of our knowledge, this work represents the first application of robust control theory to the problem of malware filtering.

A. Related work

There are several methods of dynamic packet filtering [7]. Perhaps the most common one is to dynamically change which ports are open or closed. Stateful inspection of deeper layers of packets allows for even more detailed filtering by creating and maintaining information about the state of a current connection [5]. Another possibility is to dynamically alter the set of Internet Protocol (IP) addresses from which traffic will be accepted [8]. An accurate attack packet discarding scheme based on statistical processing has been proposed in [9], where each packet is associated with a score that reflects its legitimacy. Once the score of a packet is computed, this scheme performs score-based selective packet discarding where the dropping threshold is dynamically adjusted based on the score distribution of recent incoming packets and the current level of system overload.

Implicit to the network traffic filtering problem considered in this article is the partitioning of a computer network into various sub-networks for administrative and security purposes.
This approach is common, and a separate firewall is often assigned to each sub-network. Zou et al. [10] have proposed a “Firewall Network System” based on this very concept. Cisco recommends their IOS firewalls for defending particular sub-networks or LANs in a corporate network [5]. In [11], quarantining these sub-networks is considered as a strategy to slow the spread of worm epidemics. We note that although the algorithms developed in this paper can be helpful for configuring dynamic firewalls such as the ones described above, our main objective is to develop mathematical foundations and algorithms for future security systems which will be even more configurable and flexible. Finally, while we consider the case of filtering packets, these techniques could also be applied to filtering connections.

The remainder of this article is structured as follows: Section II discusses the problem of filtering network traffic with dynamic firewalls separating sub-networks. We next derive the $H^\infty$-optimal controller and state estimator in Section III. Section IV reviews Matlab and Ns-2 simulations of the $H^\infty$-optimal controller and demonstrates its performance in comparison with other controllers. Concluding remarks and directions for future research are presented in Section V.

II. NETWORK TRAFFIC FILTERING MODEL

In this section we present a linear system model for malware traffic and study the problem of filtering network traffic to prevent malware propagation. Consider a computer network under the control of a single administrative unit, such as a corporate network. Assume the network is divided into sub-networks for administrative and security purposes [5]. While we will describe the model within this context, the corresponding control framework can be applied to other contexts by redefining the entities in question.
Let $x(t)$ represent the number of malware packets that traverse a link on their way to the destination sub-network at time $t$, originating from infected sources outside the sub-network. We model this malware flow to the sub-network using a linear differential equation with control and disturbance terms:

$$\dot{x}(t) = a\,x(t) + b\,u(t) + w_a(t), \tag{1}$$

where $u(t)$ represents the number of packets that are filtered at a particular time $t$. Usually, only some proportion of the packets filtered are actually malware related. Thus, the parameter $b$ corresponds to that proportion multiplied by $-1$. In other words, $(1 - b)$ is the proportion of filtered packets that are not malware related. On the other hand, $w_a(t)$ represents the number of malware packets added to the link at time $t$ intentionally by malicious sources or unintentionally by hidden software running on hosts, both located outside the sub-network considered. Thus $u(t)$ and $w_a(t)$ represent, for this specific sub-network, the packet filtering rate and malware infiltration rate, respectively. The value $a$ represents the instantaneous proportion of malware packets on the link that are actually delivered to the sub-network and is thus a negative number.

Expanding the dimensions of the model in (1) leads to a set of linear differential equations:

$$\dot{x}(t) = A\,x(t) + B\,u(t) + D\,w_a(t), \tag{2}$$

where $w_a$ is defined as the vector of malware packets. In this case both $A$ and $B$ are obtained simply by multiplying the identity matrix by $a$ and $b$, respectively. The $D$ matrix imposes a propagation model on the attack and quantifies how malware is routed and distributed on this network. For the purposes of this paper, it has zeros for its diagonal terms (intra-sub-network malware traffic does not leave the sub-network), and each column must sum to 1 to ensure conservation of packets.
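The structure of (2) can be illustrated with a small sketch. Everything named here is hypothetical: `uniform_propagation` builds one possible $D$ (malware from each sub-network spread evenly over all others), and `euler_step` advances the model with a simple forward-Euler step, an assumed discretization used only for illustration.

```python
def scaled_identity(n, scale):
    """Return scale * I as a list of rows (used for A = a I and B = b I)."""
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


def uniform_propagation(n):
    """Hypothetical D: zero diagonal, every column sums to 1."""
    share = 1.0 / (n - 1)
    return [[0.0 if i == j else share for j in range(n)] for i in range(n)]


def mat_vec(m, v):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def euler_step(x, u, w_a, a, b, d_matrix, dt):
    """One forward-Euler step of x' = A x + B u + D w_a."""
    n = len(x)
    ax = mat_vec(scaled_identity(n, a), x)
    bu = mat_vec(scaled_identity(n, b), u)
    dw = mat_vec(d_matrix, w_a)
    return [x[i] + dt * (ax[i] + bu[i] + dw[i]) for i in range(n)]


if __name__ == "__main__":
    D = uniform_propagation(3)
    # Sub-network 0 emits 3 packets per step; they arrive at the other two.
    print(euler_step([0.0] * 3, [0.0] * 3, [3.0, 0.0, 0.0], -1.0, -0.5, D, 1.0))
```

Because each column of this $D$ sums to 1, the total inflow across all links equals the total malware generated, which is the packet-conservation property stated above.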
In this version of the problem, the malware being sent to sub-network $i$ is a function of $w_j$ for $j \neq i$, the malicious traffic generated by other sub-networks. This assumption on the propagation of malware, inherent to the form given to $D$, allows for a centralized filtering solution that considers network-wide conditions. A decentralized version of this problem is also possible, however. Overall, this model simplifies actual network dynamics by assuming a linear system and using a fluid approximation of traffic flow.

Let us denote by $y(t)$ our measurement of the number of inbound malicious packets prior to filtering. Note that the separation between detection ($y(t)$) and response ($u(t)$) is only at the conceptual level. In the implementation both may occur on the same device. Inaccuracies in $y(t)$ are inevitable due to the challenging problem of distinguishing malicious packets from legitimate ones [3]. To capture this uncertainty formally, we define $y(t)$ as

$$y(t) := C\,x(t) + E\,w_n(t), \tag{3}$$

where $w_n(t)$ is measurement noise of any form. Later, we derive and apply the worst-case measurement noise $w_n(t)$. Additionally, we define $N := E E^T$ and assume that it is positive definite, meaning that the measurement noise impacts each dimension of the measured output. The $C$ matrix models the assumption that $y(t)$ is higher than and proportional to $x(t)$. When implemented, entries of this constant matrix could be measured from an analysis of packet filtering, and the calculations required for determining the optimal controller could be rerun periodically.

Note that we do not make any assumption on how $y(t)$ is obtained. It could be the result of some statistical analysis comparing the expected traffic to the measured traffic or be based on a set of rules where packets with certain characteristics are assumed to be malicious.
Similarly, $w_a(t)$ represents a worm attack, expressed in terms of the number of malware packets sent from a sub-network to other sub-networks at each time instant. More precisely, it is the generated malware traffic flow rate in terms of packets per time step. For example, if a worm is very rapidly contacting new hosts and sending them packets, then $w_a(t)$ would be large. However, we do not assume any form on the attack. To simplify notation, we assume that the measurement noise and attack disturbance are both part of the vector $w := [\, w_a^T \;\; w_n^T \,]^T$.

The model at hand contains several simplifications and assumptions. As was mentioned earlier, the components of the $B$ matrix are set to be constants, although in reality the value of these components should change as $x$ decreases, as there are fewer malicious packets to be filtered, and we are filtering packets we are less sure about. This quantity also depends on the amount of legitimate network traffic on the link: if there is a relatively large amount of legitimate network traffic then we will incur more false positives and thus end up filtering more legitimate traffic. The $B$ matrix is related to the false-negative and false-positive ratios, but it is mostly determined by the ratio of legitimate to illegitimate traffic as described in [3]. The exponential decay in the number of malware packets on the link (in the absence of control and disturbance) does not exactly capture network dynamics, but with a high enough rate of exponential decay, this assumption is quite realistic when capacity constraints are not significant. The assumption of a constant value for the $C$ matrix is also an approximation, as in reality the number of malware packets prior to filtering will probably not be linearly dependent upon the number after filtering. To summarize, this model simplifies actual network performance by assuming linear dynamics.
Moreover, this model simplifies system dynamics by using a fluid approximation of traffic flow. More specifically, this model only approximately captures the fact that, in an actual implementation, the number of malware packets measured prior to filtering differs from the number that arrives at the sub-network by the number of packets filtered. Similarly, in order to simplify the following calculations, we are approximating a clearly discrete and event-driven system (a computer network) with a continuous-time system. This assumption should hold when we consider the rapidity and frequency of packet arrivals and transmissions along with the fine-grained time increments of a computer network.

III. DERIVATION OF OPTIMAL CONTROLLER AND STATE ESTIMATOR

Our objective now is to design an algorithm or controller for traffic filtering given this imperfect measure of inbound malicious packets. As part of the $H^\infty$-optimal control analysis and design we introduce first the controlled output

$$z(t) := H\,x(t) + G\,u(t), \tag{4}$$

where we assume that $G^T G$ is positive definite, and that no cost is placed on the product of control actions and states: $H^T G = 0$. $H$ represents a cost on malicious packets arriving at a sub-network. A few other constraints that must be met for this $H^\infty$-optimal control theory to apply are that $(A, B)$ and $(A, D)$ be stabilizable, and $(A, H)$ and $(A, C)$ be detectable, and these conditions readily hold in our case.

If $x$ becomes negative, we are filtering legitimate packets from the link. In other words, an equal penalty is assumed for underfiltering and allowing worm-related traffic on a link and also for overfiltering and preventing legitimate network traffic from traversing the link.
By weighting these two quantities equally, we are in effect encouraging survivability: overfiltering to prevent the spread of the worm but at the same time crippling the network is penalized as much as allowing the worm-related traffic to run rampant.

The cost on filtering legitimate traffic is actually more complicated than indicated above. Recall that $b$ specifies the proportion of filtered traffic that is malware-related. Thus, $(1 - b)$ is the proportion of filtered traffic that is legitimate (assuming $x$ is positive). If we assign a cost of $f_l$ to filtering legitimate packets when malware packets are on the link and a cost of $f_a$ to the filtering action itself, the components $g$ of $G$ can be specified as $g = f_l (1 - b) + f_a$.

The cost of this system for the purpose of $H^\infty$ analysis is defined by

$$L(x, u, w) = \frac{\|z\|}{\|w\|}, \tag{5}$$

where $\|z\|^2 := \int_{-\infty}^{\infty} |z(t)|^2 \, dt$ and a similar definition applies to $\|w\|^2$. This is a cost ratio rather than an actual cost, but we will refer to it as the cost for simplicity. It captures the proportional changes in $z$ due to changes in $w$. More intuitively, it is the ratio of the cost incurred by the system to the corresponding attacker and measurement noise “effort.”

There are a few assumptions and simplifications present in this cost structure. We assign a cost to the malware packets, not the infected and disabled hosts or servers themselves, which are often where the costs of malware actually occur. On the other hand, malware traffic itself can dominate network resources and thus be costly in its own right. Another assumption is that we assign costs to traffic incoming to a sub-network even if that sub-network is already infected, in which case the incoming malicious traffic would be unimportant. In spite of these two assumptions, this cost structure captures most of the important characteristics of malware packet propagation.
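The weight $g$ and the cost ratio (5) can be approximated numerically from sampled trajectories. The sketch below is a hedged illustration with several assumptions: the function names are hypothetical; `malware_fraction` is read as the magnitude of $b$; $H$ and $G$ are assumed to be stacked as $z = [\,h x^T \;\; g u^T\,]^T$, which is one way to satisfy $H^T G = 0$; and a finite-window Riemann sum stands in for the integrals.

```python
import math


def filtering_weight(f_l, f_a, malware_fraction):
    """g = f_l * (1 - b) + f_a, with b read as the malware fraction in [0, 1]."""
    return f_l * (1.0 - malware_fraction) + f_a


def squared_norm(samples, dt):
    """Riemann-sum approximation of the integral of |v(t)|^2 over the samples."""
    return dt * sum(sum(c * c for c in v) for v in samples)


def cost_ratio(xs, us, ws, h, g, dt):
    """Approximate L = ||z|| / ||w|| from sampled trajectories.

    Assumes z stacks h*x on top of g*u, so that H^T G = 0 and
    |z|^2 = h^2 |x|^2 + g^2 |u|^2.
    """
    zs = [[h * xi for xi in x] + [g * ui for ui in u] for x, u in zip(xs, us)]
    return math.sqrt(squared_norm(zs, dt) / squared_norm(ws, dt))


if __name__ == "__main__":
    g = filtering_weight(f_l=1.0, f_a=0.5, malware_fraction=0.5)
    print(g, cost_ratio([[1.0], [1.0]], [[0.0], [0.0]], [[2.0], [2.0]], 3.0, g, 0.5))
```

A ratio computed this way measures how much cost the system incurs per unit of disturbance "effort" over the sampled window.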
$H^\infty$-optimal control theory not only applies very directly and appropriately to the problem of worm response, but also guarantees that a performance factor (the $H^\infty$ norm) will be met. This norm can be thought of as the worst possible value for the cost $L$ and is bounded above by

$$\gamma^* := \inf_u \sup_w L(u, w), \tag{6}$$

which can also be viewed as the optimal performance level in this $H^\infty$ context.

In order to actually solve for the optimal controller $\mu(y)$, the number of packets to filter as a function of the inaccurately measured number of inbound malicious packets, a corresponding differential game is defined between the attackers and the malware filtering system, which is parameterized by $\gamma$, where $\gamma > \gamma^*$:

$$J_\gamma(u, w) = \|z\|^2 - \gamma^2 \|w\|^2. \tag{7}$$

The malicious attackers try to maximize this cost function in the worst case by varying $w$ while the malware filtering algorithm minimizes it via the controller $u$. A similar application of game theory, where attackers and the intrusion detection/prevention system are modeled as players in a security game, has been investigated in [12].

The optimal filtering strategy $u = \mu_\gamma(y)$ can be determined from this differential game formulation for any $\gamma > \gamma^*$. It is given by [6]

$$\mu_\gamma(y) = -(G^T G)^{-1} B^T \bar{Z}_\gamma \hat{x}, \tag{8}$$

where $\bar{Z}_\gamma$ is solved from

$$A^T Z + Z A - Z \left( B (G^T G)^{-1} B^T - \gamma^{-2} D D^T \right) Z + H^T H = 0, \tag{9}$$

as its unique minimal positive definite solution, and $\hat{x}$ is given by

$$\dot{\hat{x}} = \left[ A - \left( B (G^T G)^{-1} B^T - \gamma^{-2} D D^T \right) \bar{Z}_\gamma \right] \hat{x} + \left[ I - \gamma^{-2} \bar{\Sigma}_\gamma \bar{Z}_\gamma \right]^{-1} \bar{\Sigma}_\gamma C^T N^{-1} (y - C \hat{x}), \tag{10}$$

where $\bar{\Sigma}_\gamma$ is the unique minimal positive definite solution of

$$A \Sigma + \Sigma A^T - \Sigma \left( C^T N^{-1} C - \gamma^{-2} H^T H \right) \Sigma + D D^T = 0. \tag{11}$$

Fig. 1. Sample computer network to be analyzed.

Here $\hat{x}$ is an estimate for $x$. This is a linear feedback controller operating on a state estimate.
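As a minimal sketch of how (8)-(11) can be evaluated, consider one sub-network in isolation under the scalar model (1), where every matrix collapses to a number and the two Riccati equations become quadratics with closed-form roots. The function names, the default weights $h = 10$ and $g = 1$ (one possible reading of a 100:1 cost ratio), the noise intensity $n = 1$, the sign convention $b = -0.5$, and the bisection bounds are illustrative assumptions, not values taken from the authors' implementation. The feasibility test uses the scalar form of the standard coupling condition, $\bar{\Sigma}_\gamma \bar{Z}_\gamma < \gamma^2$.

```python
import math


def min_positive_root(a, k, m):
    """Smallest positive X solving k*X**2 - 2*a*X - m = 0, for a < 0 and m > 0.

    Returns None when no real positive solution exists.
    """
    if abs(k) < 1e-12:
        return -m / (2.0 * a)
    disc = a * a + k * m
    if disc < 0.0:
        return None
    root = (a + math.sqrt(disc)) / k
    return root if root > 0.0 else None


def scalar_hinf_design(gamma, a=-1.0, b=-0.5, d=1.0, c=2.0, h=10.0, g=1.0, n=1.0):
    """Scalar versions of (8)-(11); returns None if gamma is infeasible."""
    s = b * b / (g * g) - d * d / gamma ** 2   # bracket in (9)
    q = c * c / n - h * h / gamma ** 2         # bracket in (11)
    z = min_positive_root(a, s, h * h)         # scalar Z from (9)
    sigma = min_positive_root(a, q, d * d)     # scalar Sigma from (11)
    if z is None or sigma is None or sigma * z >= gamma ** 2:
        return None
    return {
        "Z": z,
        "Sigma": sigma,
        "gain": -b * z / (g * g),              # u = gain * x_hat, from (8)
        "est_drift": a - s * z,                # coefficient of x_hat in (10)
        "est_gain": sigma * c / (n * (1.0 - sigma * z / gamma ** 2)),
    }


def smallest_feasible_gamma(lo=1e-3, hi=1e3, iters=60, **params):
    """Bisection estimate of gamma*, assuming feasibility is monotone in gamma."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if scalar_hinf_design(mid, **params) is None:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(scalar_hinf_design(5.0))
    print(smallest_feasible_gamma())
```

Calling `scalar_hinf_design(5.0)` returns the feedback gain applied to the estimate together with the two estimator coefficients, while `smallest_feasible_gamma()` approximates the scalar performance bound for the assumed parameters. The matrix case would require a Riccati solver instead of the quadratic formula.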
Further, $\gamma^*$ is the smallest $\gamma$ such that $\rho(\bar{\Sigma}_\gamma \bar{Z}_\gamma) < \gamma^2$, where $\rho(\Lambda)$ denotes the spectral radius of the matrix $\Lambda$. The online calculation is simply a multiplication by the estimate of the system state. Also note that this controller requires network-wide knowledge of the system state estimate, and thus this is a centralized control solution.

There are a few assumptions implicit in this specific controller formulation. The various filters will have to send control packets to each other, indicating their $y$ values. Moreover, it is assumed that these filters are able to convert a number of packets to filter per time step ($u(t)$) into a filtering rule that will implement that filtering rate. The packets that are most likely to be malicious should be filtered first. Exactly how this is done depends on the system implementation. For example, a rule-based filter could implement more rules (block more ports or IP addresses) or the sensitivity of an anomaly-based detector could be increased when $u(t)$ increases.

Remark III.1. The $H^\infty$-optimal controller derived here (8) is a centralized control solution due to the $D$ matrix, which imposes a specific malware propagation model. However, we can apply the same framework to each sub-network separately by using (1) for each. This leads to a decentralized solution consisting of independent scalar $H^\infty$-optimal controllers.

IV. SIMULATIONS

Consider the representative computer network shown in Fig. 1. In this simple network configuration, each sub-network or LAN has a dynamic firewall that filters incoming network traffic. Each firewall communicates its malicious packet measure $y$ to all other firewalls, where filtering decisions are made. No centralized server is overseeing the filtering activity.

A. Simulation setup

Several attack types are simulated in Matlab on this network topology in order to compare the $H^\infty$-optimal controller with other controllers. As a simplification, a sub-network is assumed to be either infected or not infected. An infected sub-network sends malware to other sub-networks. Sub-networks become infected with some probability once they have received a certain threshold number of malware packets. This probability increases when higher thresholds are met. Clearly the propagation of these fictitious attacks is much simpler than that of an actual worm or virus, but it captures the underlying dynamics of an attack.

Four types of malware attacks are considered: no attack (A1); a high-traffic, slow-spreading attack (A2); a low-traffic, slow-spreading attack (A3); and a low-traffic, fast-spreading attack (A4). In each of these attacks, one sub-network is initially infected and sends malware to all other sub-networks. Five response types are applied to each of these four attack types: no response (R1), the $H^\infty$-optimal controller response (R2), a threshold-based controller that implements a filter of some fixed magnitude when a certain amount of malicious packets are detected (R3), a controller that removes all suspicious packets ($y(t)$) from each link (R4), and an optimal controller that minimizes the cost $\|z\|^2$ (R5). For the linear quadratic Gaussian (LQG) optimization problem in (R5), which is obtained as the limit of the $H^\infty$ problem as $\gamma \to \infty$, we use the expected value of $\int_{-\infty}^{\infty} \|z\|^2 \, dt$ as the quadratic cost, which we again denote by $\|z\|^2$ by a slight abuse of notation.

A few details relating to the numerical analysis of these controllers will now be given. The $A$ matrix is set to be the identity matrix multiplied by $-1$. Recall that this value quantifies the exponential decay of malicious packets on the link as they arrive at their destination sub-network.
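To make this choice concrete, a short worked calculation may help (a sketch that sets control and disturbance to zero, which is a simplifying assumption). With $A = -I$, each component of (2) obeys

$$\dot{x}_i(t) = -x_i(t) \quad\Longrightarrow\quad x_i(t) = x_i(0)\, e^{-t},$$

so the malware packets on a link halve every $\ln 2 \approx 0.69$ time units, and less than 1% of them remain after 5 time units ($e^{-5} \approx 0.0067$).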
The $b$ quantity is set to 0.5. This value is consistent with a detection rate (true-positive rate) of 0.7 and a very low ($10^{-5}$) false-positive rate, a scenario considered in [3]. The $D$ matrix is set up such that sub-networks are more likely to transfer the worm within their group of three sub-networks. The $C$ matrix is set to be 2 multiplied by the identity matrix, which is derived from values observed in the Ns-2 simulations to be explained in Section IV-C. It is assumed that $w_n$ has a positive mean, as most malware detection schemes are set up to, if anything, overestimate the number of malicious packets. The standard deviation of $w_n$ is relatively low. Also, the noise is assumed to be white Gaussian noise, although in reality this noise may well have some autocorrelation.

Simulations are run with three sets of cost functions ($\|z\|^2$ and $L$) that differ in their coefficients. The ratio between the cost on inbound malware packets $x$ and the cost on filtering packets $u$ (which involves a cost on filtering legitimate packets and also the filtering cost itself) is set at 10:1, 100:1, and 1000:1.

B. Matlab simulations

We first conduct a numerical analysis in Matlab. The simulations where no response is applied demonstrate that the assumed malware packet propagation rules mimic the “S-shaped” behavior of worm or virus propagation fairly well [2]. Note that in Fig. 2 the number of malware packets arriving at the two graphed sub-networks starts small when only one sub-network is initially infected. As the worm or virus spreads, the number of inbound malware packets increases rapidly for a period but eventually levels off when more and more sub-networks become infected.

Fig. 2. Numerical analysis of slow worm attack with no response applied on two (out of 9) sub-networks. Panels, each plotted against time: Malware Sending Rate (packets per time step), Measurement of Inbound Malware Packets (number of packets), Filtering Rate (packets filtered per time step), Inbound Malware Packets (number of packets).

The $H^\infty$-optimal controller performs better than every other controller whenever malware is present, as seen in Table I. In this case, we choose a 100:1 malware packet to filtering action cost ratio. The resulting $\gamma^*$ is 4.52. Table II shows the actual costs incurred by the system in each scenario with the same cost structure. The significantly lower cost values for the $H^\infty$-optimal controller in the face of attacks highlight its ability to filter enough to prevent sub-networks from becoming infected.

TABLE I
COST RATIOS ($L$) OF CONTROLLERS UNDER VARIOUS ATTACKS ($b = 0.5$)

Attack   R1     R2     R3     R4     R5
A1       0.00   3.48   0.00   2.35   2.04
A2       8.36   3.00   8.02   4.45   5.42
A3       9.07   2.88   5.76   4.42   4.71
A4       9.42   2.90   5.31   4.49   5.15

TABLE II
COSTS ($\|z\|^2$) OF CONTROLLERS UNDER VARIOUS ATTACKS ($b = 0.5$) ($\times 10^3$)

Attack   R1      R2      R3      R4      R5
A1       0       1.172   0       0.788   0.682
A2       105.4   18.24   94.08   46.85   88.24
A3       22.68   5.579   16.77   12.50   10.34
A4       27.97   4.979   13.51   12.63   14.24

The preventative ability of the $H^\infty$-optimal controller can also be observed in Fig. 3. As soon as the first network detects an increase in inbound malware packets shortly after 10 time units, the controller begins filtering significantly (see Fig. 3 “Filtering Rate”) all across the network. This prevents the second sub-network from becoming infected. We indeed observe that it never sends malware packets in Fig. 3 “Malware Sending Rate.”

Fig. 3. Numerical analysis of slow worm attack with $H^\infty$ response applied on two sub-networks. Panels, each plotted against time: Malware Sending Rate (packets per time step), Estimate of Inbound Malware Packets (number of packets), Filtering Rate (packets filtered per time step), Inbound Malware Packets (number of packets).

The ability of the centralized $H^\infty$-optimal controller to respond network-wide to an attack, and hence to increase filtering rates significantly even on sub-networks where there are not yet many malware packets being detected, provides an advantage over other controllers. Another advantage is that it tends to filter packets aggressively (see Fig. 3). We observe this robustness property of the $H^\infty$-optimal controller in the “Filtering Rate” graph of Fig. 3, where the number of packets filtered is higher than the number of inbound malware packets. This also contributes to preventing infections, decreasing cost to the network ($\|z\|^2$), and guaranteeing some level of performance ($\gamma$).

For comparison, Fig. 4 shows the performance of the controller that removes all the estimated malware packets, thereby disregarding measurement errors and network-wide conditions. While it does over-filter, it does not filter network-wide when a single sub-network detects significant numbers of malware packets. Thus, the uninfected sub-network eventually becomes infected at around time step 25, which causes it to send malware (Fig. 4). The LQG optimal controller (R5), on the other hand, does filter network-wide upon detection of inbound malware packets anywhere in the network. It does not, however, filter as much as the $H^\infty$-optimal controller.
Moreover, it is hindered in that it assumes a zero-mean disturbance, an assumption that becomes more inaccurate as more sub-networks become infected.

Fig. 4. Numerical analysis of slow worm attack with the controller that removes as many malware packets as it measures on two sub-networks. Panels, each plotted against time: Malware Sending Rate (packets per time step), Measurement of Inbound Malware Packets (number of packets), Filtering Rate (packets filtered per time step), Inbound Malware Packets (number of packets).

Fig. 5. Network model response to no infections with the $H^\infty$-optimal controller. Panels, each plotted against time: Malware Sending Rate (packets per time step), Estimate of Inbound Malware Packets (number of packets), Filtering Rate (packets filtered per time step), Inbound Malware Packets (number of packets).

The $H^\infty$-optimal controller, on the other hand, tends to incur relatively high costs and cost ratios when there are no infected sub-networks due to its network-wide over-response (refer to Tables I and II). The very characteristics that make it a strong controller in the face of attacks prove costly in the absence of attacks. In fact, the theoretical worst-case attack is actually quite small in magnitude and essentially maximizes $L$ by taking advantage of the tiny false alarms and corresponding excessive filtering that inaccurate measurements induce in the $H^\infty$-optimal controller. Figure 5 demonstrates this behavior.
Note that the negative number of inbound malware packets indicates that all malware packets have been filtered and legitimate traffic is being removed from the link.

Simulations were also run for other cost functions. The $H^\infty$-optimal controller performed relatively better when there was a greater cost put on the inbound malware packets and vice versa. This is to be expected, as this controller is rewarded more for being cautious when the inbound malware packets increase in cost. When the $b$ value was decreased from 0.5 to 0.3, the $H^\infty$-optimal controller also performed relatively better. This decrease in $b$ means that when filtering does occur, we are less likely to actually filter a malicious packet, and thus controllers that filter more are rewarded. A decreased $b$ could result from a lower true-positive rate, a higher false-positive rate, or a higher ratio of legitimate to malicious traffic.

Fig. 6. Screenshot of the Ns-2 simulator output. Green packets are legitimate and red packets are malware.

C. Ns-2 Implementation

We simulate the traffic control algorithm developed at the packet level using the Ns-2 network simulator. Our goal is to further investigate the characteristics of the designed $H^\infty$-optimal controller and demonstrate its capabilities in a realistic setting. To enable comparisons with the numerical results obtained from Matlab simulations, we define in Ns-2 the same network topology as in Section IV, which is depicted in Fig. 1. Depending on the specific application, the end nodes in this graph may represent a sub-network or any logical or physical set of hosts. As before, we assume high-capacity links between nodes such that no malware packet is dropped due to congestion, corresponding to a worst-case scenario.
In order to simulate the filtering algorithm, we consider here a specific two-part implementation consisting of monitoring and filtering elements. The monitoring nodes, depicted as hexagons in Fig. 6, associate a malware score s ∈ [0, 99] with each individual packet passing through the link from the outside. As a simplification, we simulate only inbound monitoring and filtering. However, a symmetrical outbound counterpart of the scheme can easily be implemented. The monitoring elements use this score s and a specific constant threshold to make an initial estimate of the nature of the packet and label it as malware or not. A count of these observed malware packets gives y(t). The monitoring node may utilize any set of algorithms or approaches to determine this quantity. We generate the scores randomly according to different probability distributions for legitimate and malicious packets and use a fixed threshold to simulate this process. This method is similar in some ways to the scoring strategy proposed in [9].

The filtering elements, depicted as boxes in Fig. 6, first fetch the malware score s and the flag from the headers of inbound packets, and then use either a heuristic or an H∞ controller-based algorithm to make filtering decisions. In this implementation, the algorithms decide on a time-varying threshold value (different from the previous constant measurement threshold), resulting in a dynamic filtering scheme. The packets with a score higher than the threshold are filtered. For comparison purposes, we simulate the R4 algorithm in Section IV-A, which we denote as heuristic, in addition to the H∞ algorithm. We do not simulate any filtering scheme with a time-invariant threshold, as it clearly would underperform in a dynamic network environment when compared with the dynamic threshold algorithms.
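The two-part scheme above can be illustrated with a minimal Python sketch. It is not the Ns-2 implementation used in the paper. The Gaussian score distributions, their means and spread, and the function names `draw_score`, `monitor`, `threshold_for_count`, and `filter_packets` are all hypothetical choices made for illustration. The number of packets to filter is assumed to be supplied by whichever algorithm (heuristic or H∞-based) is in use.

```python
import random

SCORE_MIN, SCORE_MAX = 0, 99  # score range s in [0, 99], as in the text


def draw_score(is_malware, rng):
    """Draw a malware score; the Gaussian shapes are illustrative assumptions."""
    mean = 65.0 if is_malware else 35.0
    score = int(round(rng.gauss(mean, 15.0)))
    return max(SCORE_MIN, min(SCORE_MAX, score))


def monitor(packets, label_threshold, rng):
    """Monitoring element: score each packet and flag it against a fixed threshold.

    `packets` is a sequence of booleans (True = truly malicious).
    Returns (scored, y), where `scored` holds (score, flag) pairs and `y` is
    the number of packets labeled as malware, i.e. the measurement y(t).
    """
    scored = []
    for is_malware in packets:
        s = draw_score(is_malware, rng)
        scored.append((s, s > label_threshold))
    y = sum(1 for _, flag in scored if flag)
    return scored, y


def threshold_for_count(scores, n_filter):
    """Pick a threshold such that at most n_filter scores lie strictly above it.

    Ties at the threshold are passed, so fewer than n_filter packets may be
    filtered when many packets share the same score.
    """
    if n_filter <= 0:
        return SCORE_MAX              # no score exceeds the maximum
    ranked = sorted(scores, reverse=True)
    if n_filter >= len(ranked):
        return SCORE_MIN - 1          # every score exceeds this value
    return ranked[n_filter]


def filter_packets(scored, threshold):
    """Filtering element: drop packets whose score is higher than the threshold."""
    passed = [p for p in scored if p[0] <= threshold]
    dropped = [p for p in scored if p[0] > threshold]
    return passed, dropped


if __name__ == "__main__":
    rng = random.Random(0)
    packets = [False] * 900 + [True] * 100   # hypothetical traffic mix
    scored, y = monitor(packets, label_threshold=50, rng=rng)
    thr = threshold_for_count([s for s, _ in scored], n_filter=120)
    passed, dropped = filter_packets(scored, thr)
    print(f"y(t) = {y}, threshold = {thr}, dropped = {len(dropped)}")
```

Separating the fixed labeling threshold from the time-varying filtering threshold mirrors the distinction drawn in the text: the first produces the measurement, the second implements the control action.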
We calculate the H∞-optimal controller offline in Matlab and transfer the results to the Ns-2 simulator. In accordance with the model in Section II, the resulting controller decides on the number of malware packets to be filtered in a given time interval. We translate this number into a threshold value by periodically observing the distribution of scores generated by the monitoring element. Hence, the threshold is chosen such that the number of packets with a score higher than the threshold (i.e., to be filtered) matches the number dictated by the H∞-optimal controller.

Remark IV.1. It is important to note that the example Ns-2 implementation we choose here does not play a significant role in the analysis and demonstration of our algorithm. In fact, depending on the specific application at hand, one can choose among a variety of equivalent implementations without any loss of generality. For example, the monitoring and filtering elements can each be part of larger units or be combined within a dedicated physical device. Alternatively, the monitoring element can be deployed as a dedicated hardware device and the filtering element as part of a firewall implementation. Clearly, the possible combinations are numerous.

We simulate, compare, and contrast the H∞ and detection-based heuristic filtering schemes in a variety of scenarios under different cost structures, detection capabilities, and traffic levels. The hypothetical scenarios we consider are summarized as follows:
1) A cost on malware packets (x) to cost on filtering (u) ratio of 100:1 in ‖z‖² and L. We assume that the monitoring devices are capable of scoring and labeling only half of the malware packets correctly (S1).
2) The cost is the same as in Scenario 1, but we consider a more pessimistic case where the monitoring device detects only a quarter of the total malware packets (S2).
3) This scenario is the same as Scenario 1 except for an increase in the cost coefficient ratio to 200:1 (S3).
4) Likewise, this scenario is the same as Scenario 2 with a cost coefficient ratio of 200:1 (S4).
5) The final scenario matches Scenario 1 but has a cost coefficient ratio of 0.1:1 (S5).

In all of the above cases, each end-node (sub-network) sends randomly fluctuating 1000-KB legitimate traffic to all sub-networks. In addition, we consider an “infection” or worm-like malware propagation scheme, where each sub-network becomes “infected” with some probability if it receives sufficiently many malware packets and afterward generates malware traffic of 200 KB to other nodes.

TABLE III
COST RESULTS OF NS-2 SIMULATIONS

         H∞-Optimal                      Detection-Based
Scen.    L       ‖z‖² (×10⁶)    γ*       L       ‖z‖² (×10⁶)
S1       3.9     77             3.2      4.9     147
S2       3.7     89             3.2      6.6     369
S3       4.2     87             4.2      6.9     287
S4       4.9     155            4.2      9.3     736
S5       0.31    0.68           0.3      1.05    6.78

The numerical results for both of the algorithms under each scenario described above are summarized in Table III. We observe here several expected characteristics of the H∞ controller, such as optimality with respect to the cost functions and robustness. In almost all of the cases and over a wide range of cost coefficient ratios, it outperforms the detection-based heuristic scheme. More importantly, it exhibits robustness with respect to variations in detection quality (compare Scenario 1 with Scenario 2) and guarantees an upper bound on the cost L. It is observed that the L value is always near the theoretically calculated bound γ*. Another indication of the H∞-optimal controller's robustness is its satisfactory performance even though it is calculated offline with estimated system characteristics. This offline calculation, along with the assumptions inherent in the model, explains the occasional discrepancies observed between the L values and the theoretical upper bounds γ*.
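To make the quantities reported in Table III more concrete, the following Python sketch computes ‖z‖² and L from sampled trajectories. It rests on assumptions not spelled out in this section: that ‖z‖² is a weighted sum of squares of x and u over time, with the cost coefficient ratio (for example 100:1) applied to the squared terms, and that L is the ratio ‖z‖/‖w‖, as is usual in H∞ formulations. The function names `cost_ratio` and `relative_gap` are hypothetical.

```python
import math


def cost_ratio(x, u, w, weight_x=100.0, weight_u=1.0):
    """Return (z_norm_sq, L) for sampled trajectories x, u and disturbance w.

    Assumed definitions (illustrative, see the lead-in):
        ||z||^2 = sum_t (weight_x * x_t^2 + weight_u * u_t^2)
        L       = ||z|| / ||w||
    """
    if len(x) != len(u):
        raise ValueError("x and u must have the same number of samples")
    z_norm_sq = sum(weight_x * xi * xi + weight_u * ui * ui
                    for xi, ui in zip(x, u))
    w_norm_sq = sum(wi * wi for wi in w)
    if w_norm_sq == 0.0:
        raise ValueError("disturbance norm is zero; L is undefined")
    return z_norm_sq, math.sqrt(z_norm_sq / w_norm_sq)


def relative_gap(L, gamma_star):
    """Relative distance of an observed cost L from the bound gamma*."""
    return (L - gamma_star) / gamma_star


if __name__ == "__main__":
    # Toy trajectories, chosen only to exercise the functions.
    z2, L = cost_ratio(x=[1.0, 0.0], u=[0.0, 2.0], w=[1.0, 1.0])
    print(z2, L)
    # Values taken from Table III, Scenario 1, H-infinity column.
    print(relative_gap(3.9, 3.2))
```

A positive relative gap, as for Scenario 1 in Table III, corresponds to the discrepancies between L and γ* that the text attributes to the offline calculation and the modeling assumptions.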
Fig. 7. Various inbound packet flow rates to sub-network 1 under the detection-based filtering. (Two panels of flow rate in packets per second versus time: x, y, and w; u and m.)

We next analyze the time-series data collected for a representative sub-network. We depict various quantities of interest, namely x (malware packets that pass through the filter), y (packets labeled as malware by the monitor), and u (filtering rate), as in Sub-section IV-B. In addition, we plot the rate of packets falsely labeled as malware, m, and the rate of real malware flow, w. Figure 7 shows the evolution of these quantities over time in Scenario 1 under the detection-based scheme, whereas Fig. 8 depicts the counterpart for the H∞ controller. We observe that the H∞ controller performs better than the detection-based scheme in terms of removing the malware packets through aggressive filtering, in line with the preferences expressed in the cost function. Concurrently, this leads to a slower infection rate, as can be inferred from the evolution of the real malware flow rate (w) in Fig. 8. On the other hand, when the cost coefficient ratio changes to the one in Scenario 5, the H∞ controller is much less aggressive in filtering due to the high cost of dropping legitimate packets. This can be seen in Fig. 9, where the maximum filtering rate is significantly lower than in other scenarios.

Fig. 8. Various inbound packet flow rates to sub-network 1 under H∞ controller. (Two panels of flow rate in packets per second versus time: x, y, and w; u and m.)

Fig. 9. Various inbound packet flow rates to sub-network 1 under H∞ controller when the cost coefficient ratio is 0.1:1 (Scenario 5). (Two panels of flow rate in packets per second versus time: x, y, and w; u and m.)

We finally consider the case when one of the sub-networks (say 5) is more valuable than others and needs more intensive inbound filtering. This preference can easily be reflected in the cost function by increasing the respective entry of the matrix H in (4). Thus, the H∞ controller reacts accordingly and filters more aggressively for this sub-network compared to any other, as depicted in Fig. 10.

Fig. 10. Inbound malware packet flow rates to sub-networks 1 and 5 (more valuable) under H∞ controller. (Flow rate in packets per second versus time.)

V. CONCLUSION

We have studied an application of robust control theory to network security by investigating an H∞-optimal control formulation of the network filtering problem that captures its inherent challenges, such as the base-rate fallacy, and takes into account relevant costs. The corresponding H∞-optimal controller has been derived and analyzed numerically in Matlab as well as simulated in Ns-2. The controller performs better than alternative controllers when there is a significant amount of malware traffic present. In addition, it provides a certain performance guarantee for a wide range of conditions.

There exist several possible extensions to this work.
Obtaining a distributed version of this controller for a larger system could be one future direction. Another research direction is the application of similar H∞-optimal controllers to other network security problems, such as spam filtering and DDoS attacks.

ACKNOWLEDGMENT

The authors would like to thank the Boeing Corporation and Deutsche Telekom AG for their support of this research, for the former through the Information Trust Institute at the University of Illinois at Urbana-Champaign. An earlier, more concise version of this paper will be presented at the 46th IEEE Conference on Decision and Control, New Orleans, December 12-14, 2007, with the title “An optimal control approach to malware filtering.”

REFERENCES

[1] D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, and N. Weaver, “Inside the Slammer worm,” IEEE Security & Privacy Magazine, vol. 1, pp. 33–39, July-Aug. 2003.
[2] D. Moore, C. Shannon, and K. Claffy, “Code-Red: A case study on the spread and victims of an Internet worm,” in Proc. of ACM SIGCOMM Workshop on Internet Measurement, Marseille, France, 2002, pp. 273–284.
[3] S. Axelsson, “The base-rate fallacy and its implications for the difficulty of intrusion detection,” in Proc. of 6th ACM Conference on Computer and Communications Security, Kent Ridge Digital Labs, Singapore, 1999, pp. 1–7.
[4] K. Rohloff and T. Başar, “The detection of RCS worm epidemics,” in Proc. of ACM Workshop on Rapid Malcode, Fairfax, VA, 2005, pp. 81–86.
[5] Cisco, “NAT and stateful inspection in Cisco IOS firewall,” white paper, 2006. [Online]. Available: http://www.cisco.com/en/US/tech/tk648/tk361/technologies_white_paper09186a0080194af8.shtml
[6] T. Başar and P. Bernhard, H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 2nd ed. Boston, MA: Birkhäuser, 1995.
[7] M. Tulloch, Microsoft Encyclopedia of Security. Redmond, WA: Microsoft Press, 2003.
[8] S. Hazelhurst, “A proposal for dynamic access lists for TCP/IP packet filtering,” in Proc. of South African Institute of Computer Scientists and Information Technologists Annual Conference, Pretoria, South Africa, September 2001, http://arxiv.org/abs/cs/0110013.
[9] Y. Kim, W. C. Lau, M. C. Chuah, and H. J. Chao, “PacketScore: A statistics-based packet filtering scheme against distributed denial-of-service attacks,” IEEE Trans. on Dependable and Secure Computing, vol. 3, no. 2, pp. 141–155, April-June 2006.
[10] C. Zou, D. Towsley, and W. Gong, “A firewall network system for worm defense in enterprise networks,” University of Massachusetts, Amherst, MA, Technical Report TR-04-CSE-01, Feb. 2004.
[11] T. M. Chen and N. Jamil, “Effectiveness of quarantine in worm epidemics,” in Proc. of IEEE ICC 2006, Istanbul, Turkey, June 2006, pp. 2142–2147.
[12] T. Alpcan and T. Başar, “A game theoretic analysis of intrusion detection in access control systems,” in Proc. 43rd IEEE Conf. Decision and Control, Paradise Island, Bahamas, December 2004, pp. 1568–1573.
