Solving package dependencies: from EDOS to Mancoosi
Mancoosi (Managing the Complexity of the Open Source Infrastructure) is an ongoing research project funded by the European Union for addressing some of the challenges related to the "upgrade problem" of interdependent software components of which Deb…
Authors: Ralf Treinen (PPS), Stefano Zacchiroli (PPS)
Solving Package Dependencies: From EDOS to Mancoosi ∗

Ralf Treinen and Stefano Zacchiroli
Laboratoire Preuves, Programmes et Systèmes
Université Paris Diderot, Paris, France
{treinen,zack}@{pps.jussieu.fr,debian.org}

November 6, 2018

Abstract

Mancoosi (Managing the Complexity of the Open Source Infrastructure) is an ongoing research project funded by the European Union for addressing some of the challenges related to the “upgrade problem” of interdependent software components, of which Debian packages are prototypical examples. Mancoosi is the natural continuation of the EDOS project, which has already contributed tools for distribution-wide quality assurance in Debian and other GNU/Linux distributions. The consortium behind the project consists of several European public and private research institutions as well as some commercial GNU/Linux distributions from Europe and South America. Debian is represented by a small group of Debian Developers who are working in the ranks of the involved universities to drive and integrate back achievements into Debian. This paper presents relevant results from EDOS in dependency management and gives an overview of the Mancoosi project and its objectives, with a particular focus on the prospective benefits for Debian.

1 Introduction

Building and maintaining a free software distribution is a challenging task. A user expects to be able to install any selection of packages from the distribution on his machine, and that the installation goes smoothly and results in a working system with the desired functionality. Any requirement, for instance the need of installing certain auxiliary packages from the distribution, should be detected by the tools coming with the distribution, and should be satisfied automatically whatever packages the user wishes to install.
Incompatibilities in user wishes should be detected and reported back to the user with a satisfying explanation. Software is expected to be readily available in its latest version, of course well-tested without any bugs or any remaining incompatibilities with other software components. All this is expected to work smoothly on a wide range of architectures and system configurations.

It is the task of a package maintainer to do her best to satisfy these expectations. Luckily, a maintainer has at her disposal a sophisticated infrastructure, a knowledge base of policies and best practices, and the support of her fellow developers. On the other hand, the maintainer is also faced with upstream authors who usually have their own ideas about how their software is supposed to be compiled, or how it should interact with the rest of the system.

∗ The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 214898.

The EDOS research project (for Environment for the development and Distribution of Open Source software) had the objective of providing FOSS distributions with better tools to help them do their job. The project was funded by the European Commission under the IST (Information Society Technologies) activities of the 6th Framework Programme. Besides several public research institutions from different European countries and some small enterprises in the FOSS business, there were two commercial GNU/Linux distributions in the project: Mandriva from France, which builds one of the most popular RPM-based distributions, and Caixa Mágica from Portugal, which is well-known in Portuguese-speaking countries. The latter distribution is also RPM-based, and Caixa Mágica is also the upstream author of the apt RPM tool.
For the successor project Mancoosi (for Managing the Complexity of the Open Source Infrastructure), Pixart from Argentina joined in with its Debian-based distribution. EDOS started in October 2004 and ended in June 2007. Mancoosi started in February 2008 for a duration of 3 years.

The EDOS project was relatively broad in scope and had workpackages on the following subjects:

• formal management of software dependencies
• flexible testing framework
• peer-to-peer content dissemination system
• metrics and evaluation

In this paper we leave the last three of these workpackages aside, since the authors haven't been involved in them, and present from EDOS only the workpackage on dependency management. We decided to focus on the problem of distribution coherence from the release manager's point of view, and therein on one basic question: Is it possible, for a given user selection of packages, to install these when only the packages from this repository are available? We took into account only package relationships that are expressed by the metadata of packages (that is, in Debian: the control file). Relevant results and applications for Debian will be presented in Section 2.

The successor project Mancoosi again has several workpackages. The stream on dependency management takes off where EDOS ended and tries to extend our previous results to build better tools for the system administrator who wants to perform a system upgrade or package installation on a real system. More about this will be discussed in Section 3.

EDOS has developed its own terminology, which Mancoosi continues to use:

Installer A tool to unpack and configure, upgrade, or remove a locally available package on a local system. In Debian: dpkg.
Meta-Installer A tool to resolve (higher level) user requests of installing, upgrading, or removing packages on a system. This tool will have to access possibly remote package repositories, and construct a sequence of commands for an installer. In Debian: apt-get, aptitude, dselect.

Metadata of a package is the data that can be statically (that is, without performing an actual installation) extracted from a package. In the case of Debian this is the contents of a package's control file, which flows into the APT package lists (Packages and Sources).

    Package: a                 Package: a
    Version: 1                 Version: 1
    Depends: b, c|d(>=2)       Depends: b(=2)|b(=3), c(=3)|d(=2)|d(=3)

    Package: b                 Package: b
    Version: 2                 Version: 2

    Package: b                 Package: b
    Version: 3                 Version: 3

    Package: c                 Package: c
    Version: 3                 Version: 3
    Conflicts: b               Conflicts: b(=2), b(=3)

    Package: d                 Package: d
    Version: 1                 Version: 1

    Package: d                 Package: d
    Version: 2                 Version: 2

    Package: d                 Package: d
    Version: 3                 Version: 3

Figure 1: A distribution (to the left) and its expansion (to the right).

2 The Past: EDOS

2.1 Formalization of Inter-Package Relations

One of the first objectives of the EDOS project was to establish a simple mathematical model of a (GNU/Linux) distribution. We decided to restrict ourselves in the context of EDOS to relations between packages as they are seen by a meta-installer. Though the model is general enough to describe the essential features of common packaging systems (in particular Debian and RPM) we will focus in the following on the modeling of the package relations as found in Debian.

The Debian policy lists different possible relations between binary packages: Depends, Recommends, Suggests, Pre-Depends, Enhances, and Conflicts. The Replaces relation concerns only the installer (not the meta-installer), and the same seems to be true for the Breaks relation (which wasn't included in policy anyway at the time of the EDOS project). Relations between source packages and binary packages are not of interest for us. However, we have to take into account Provides (that is, virtual packages), and the fact that relations may be disjunctive (e.g., a|b|c), and may be qualified by constraints on version numbers.

We decided to ignore relations that are not essential for a meta-installer in order to decide about installability. This eliminates Suggests and Enhances from our list of interesting relations, and we also decided to ignore Recommends relations. Pre-Depends can for our purposes be identified with Depends. This leaves us with Depends and Conflicts.

    Package: a                 Package: a
    Provides: v

    Package: b                 Package: b
    Provides: v                Depends: w
    Depends: w

                               Package: v
                               Depends: a|b

    Package: c                 Package: c
    Provides: w                Conflicts: d
    Conflicts: w

    Package: d                 Package: d
    Provides: w                Conflicts: c
    Conflicts: w

                               Package: w
                               Depends: c|d

Figure 2: A distribution involving virtual packages (to the left) and its expansion (to the right). Version numbers are omitted.

The next question was how to handle constraints on version numbers like >= 1:2.3.4-5. We decided not to complicate our model with version numbers and their comparison, and to expand version constraints: given a package in a package dependency, we replace it by the disjunction of all versions of that package that exist in the current distribution and satisfy the constraint. In case of a conflict we replace the package by the set of all such versions of that package. An example of that expansion is given in Figure 1.
This expansion has the advantage that we get rid of constraints on version numbers, but it has the drawback that it is always relative to a set of available packages. This might pose a problem when one wants to make the expansion incremental. For instance, if the original distribution is extended by a new version 4 of package d, we would have to reconsider in the expansion all packages that have a relation to d. In our example, that means that we have to change the Depends line of package a and add |d(=4).

Expansion also introduces each virtual package explicitly, as a package that depends on the disjunction of all packages that provide it. Special care has to be taken with conflicts on virtual packages, as a package may at the same time provide a virtual package and conflict with it. Section 7.4 of the Debian policy states that in this case the package conflicts with each package providing that virtual package, with the exception that the package doesn't conflict with itself. An example of an expansion involving virtual packages is given in Figure 2.

We can now state the formal definition of a package and a repository:

Definition 1 A package is a pair consisting of a name and a version number.

Note that we have not defined what package names and version numbers are; it suffices for us that we can know when two names or version numbers are equal (as we assume that we are working with an expanded repository).

Definition 2 A repository is a tuple $R = (P, D, C)$ where $P$ is a set of packages, $D : P \to \mathcal{P}(\mathcal{P}(P))$ is the dependency function (we write $\mathcal{P}(X)$ for the set of subsets of $X$), and $C \subseteq P \times P$ is the conflict relation. The repository must satisfy the following conditions:

• The relation $C$ is symmetric, i.e., $(\pi_1, \pi_2) \in C$ if and only if $(\pi_2, \pi_1) \in C$ for all $\pi_1, \pi_2 \in P$.
• Two packages with the same name but different versions conflict, that is, if $\pi_1 = (u, v_1)$ and $\pi_2 = (u, v_2)$ with $v_1 \neq v_2$, then $(\pi_1, \pi_2) \in C$.

In this definition, the function $D$ yields for any package the set of all its dependencies. All these dependencies must be satisfied simultaneously. If any such dependency is a set with more than one element then this set is understood as a set of alternatives. The last restriction, stating that two different versions of the same package are in an implicit conflict, is specific to Debian (RPM does not have this a priori restriction).

It is now straightforward to translate an expanded Packages file into a repository according to Definition 2. For the expanded Packages file on the right of Figure 1, for example, we obtain $(P, D, C)$ as follows:

$$P = \{(a,1), (b,2), (b,3), (c,3), (d,1), (d,2), (d,3)\}$$
$$D(a,1) = \{\{(b,2), (b,3)\}, \{(c,3), (d,2), (d,3)\}\}$$
$$D(b,2) = \emptyset$$
$$\cdots$$
$$C = \{((b,2),(b,3)), ((b,3),(b,2)), ((c,3),(b,2)), ((b,2),(c,3)), \ldots\}$$

Definition 3 An installation of a repository $R = (P, D, C)$ is a subset $I$ of $P$, giving the set of packages installed on a system. An installation is healthy when the following conditions hold:

• Abundance: Every package has what it needs. Formally, for every $\pi \in I$, and for every dependency $d \in D(\pi)$ we have $I \cap d \neq \emptyset$.
• Peace: No two packages conflict. Formally, $(I \times I) \cap C = \emptyset$.

Definition 4 A package $\pi$ of a repository $R$ is installable if there exists a healthy installation $I$ such that $\pi \in I$. Similarly, a set of packages $\Pi$ of $R$ is co-installable if there exists a healthy installation $I$ such that $\Pi \subseteq I$.

Note that because of conflicts, every member of a set $X \subseteq P$ may be installable without the set $X$ being co-installable.
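Definitions 3 and 4 translate almost literally into code. The following Python sketch checks abundance and peace for the repository of Figure 1 and decides co-installability by exhaustive search. The function names are hypothetical, and the brute-force enumeration is only meant to make the definitions concrete on a tiny example; it is exponential and is not how the EDOS tools proceed.

```python
from itertools import combinations


def is_healthy(installation, D, C):
    """Definition 3: abundance and peace."""
    I = set(installation)
    # Abundance: every dependency d of every installed package meets I.
    abundance = all(I & set(d) for p in I for d in D.get(p, []))
    # Peace: no pair of installed packages is in the conflict relation.
    peace = not any((p, q) in C for p in I for q in I)
    return abundance and peace


def co_installable(packages, P, D, C):
    """Definition 4, decided by trying every superset of `packages`."""
    wanted = set(packages)
    others = [p for p in P if p not in wanted]
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            if is_healthy(wanted | set(extra), D, C):
                return True
    return False


# The expanded repository of Figure 1.
P = [("a", 1), ("b", 2), ("b", 3), ("c", 3), ("d", 1), ("d", 2), ("d", 3)]
D = {("a", 1): [{("b", 2), ("b", 3)}, {("c", 3), ("d", 2), ("d", 3)}]}
declared = {(("c", 3), ("b", 2)), (("c", 3), ("b", 3))}
# Two versions of the same package conflict implicitly (Definition 2).
implicit = {(p, q) for p in P for q in P if p[0] == q[0] and p[1] != q[1]}
# Symmetric closure of the conflict relation.
C = {(p, q) for (p, q) in declared | implicit} | \
    {(q, p) for (p, q) in declared | implicit}

print(co_installable([("a", 1)], P, D, C))
print(co_installable([("c", 3)], P, D, C))
print(co_installable([("a", 1), ("c", 3)], P, D, C))
```

In this repository both (a,1) and (c,3) are installable on their own, but they are not co-installable: a needs some version of b, and c conflicts with every version of b.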
One can even show that sets which are not co-installable, and minimal with this property, can be arbitrarily large: Let, for a given number $n$, $R_n$ be the following repository:

$$P = \{a_1, \ldots, a_n, b_1, \ldots, b_n\}$$
$$D(a_i) = \{\{b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n\}\}$$
$$D(b_i) = \emptyset$$
$$C = \{(b_i, b_j) \mid i \neq j\}$$

In this repository, every package $a_i$ depends on the disjunction of all packages $b_j$ with $j \neq i$. Hence, any incomplete collection of packages $a$ is co-installable: if package $a_i$ is a package missing from that collection then we can simply satisfy all dependencies by installing package $b_i$. Installing all packages $a$ together, however, would require installing at least two different packages $b$. Since any two different packages $b$ are in conflict this is not possible.

The desirable property that we want to ensure for a repository $R$ is the following:

Definition 5 A repository $R$ is trimmed if every package $\pi \in R$ is installable with respect to $R$ itself.

In Debian lingo this translates to the fact that no package in the repository is “broken”, i.e. that for any given package there is at least one possible installation containing it. If this is not the case then that particular Debian distribution will be shipping packages that users will never be able to install.

2.2 Results, Tools, and Applications

2.2.1 Result: Installability is NP-complete

Based on the formalization given in Section 2.1 one can now quite easily show that the problem whether a given package is installable in a given repository is logarithmic-space equivalent to the famous SAT problem. This means two things:

1. One can construct for any installability problem a SAT problem such that the former has a solution if and only if the latter has a solution [EDO05, MBC+06].
2. One can construct for any SAT problem an installability problem such that the former has a solution if and only if the latter has a solution [EDO06].

The “logarithmic space” qualifier means that the construction can be done with auxiliary memory of size logarithmic in the size of the given problem. This is necessary to transfer complexity results from one problem to the other.

For instance, in order to translate an installability problem into a SAT problem we will interpret a package p as a Boolean variable with the intuitive meaning that package p is installed in the chosen solution. Dependencies are translated as implications: If package p depends on a,b,c|d,e|f (which would be written $D(p) = \{\{a\}, \{b\}, \{c, d\}, \{e, f\}\}$ according to Definition 2) then this translates to the Boolean implication:

$$p \to a \wedge b \wedge (c \vee d) \wedge (e \vee f)$$

A conflict, say between packages a and b, is expressed as the formula $\neg(a \wedge b)$. The formula $p$ expresses that the package p has to be installed. This encoding opens the way to using existing SAT solving techniques for the resolution of installability problems (see Section 2.2.2).

Since one has reductions in both directions one obtains an exact worst-case complexity:

Theorem 1 The problem whether a given package is installable in a repository is NP-complete.

On a theoretical level this means that checking installability is infeasible in its full generality. In practice it means as little as that it is a challenging problem, since in practice one does not encounter randomly chosen repositories. The repositories we encounter in reality have a quite particular structure. For instance we will certainly have few packages with a very high number of reverse dependencies, and a large number with very few reverse dependencies. Indeed, the implementation developed in the EDOS project is surprisingly efficient (see Section 2.2.2).
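The encoding of Section 2.2.1 can be made concrete with a small Python sketch. It turns dependencies, conflicts and the requested packages into clauses and searches for a satisfying assignment with a naive backtracking procedure. Both function names are hypothetical, and the solver is a deliberately simple stand-in chosen for readability; it is not the customized solver used by the EDOS tools.

```python
def encode(D, C, wanted):
    """Translate a repository into CNF clauses.

    A literal is a pair (package, polarity); a clause is a list of literals.
    """
    clauses = []
    for p, deps in D.items():
        for alternatives in deps:
            # p -> (q1 or q2 or ...)  is the clause  (not p) or q1 or q2 ...
            clauses.append([(p, False)] +
                           [(q, True) for q in sorted(alternatives)])
    for (p, q) in sorted(C):
        # not (p and q)  is the clause  (not p) or (not q)
        clauses.append([(p, False), (q, False)])
    for p in wanted:
        clauses.append([(p, True)])
    return clauses


def solve(clauses, assignment=None):
    """Return a satisfying partial assignment as a dict, or None."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(v) == pol for v, pol in clause):
            continue                      # clause already satisfied
        rest = [(v, pol) for v, pol in clause if v not in assignment]
        if not rest:
            return None                   # clause falsified
        simplified.append(rest)
    if not simplified:
        return assignment
    # Branch on a literal of a shortest clause, so unit clauses come first.
    var, pol = min(simplified, key=len)[0]
    for value in (pol, not pol):
        result = solve(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None


# The example from the text: p depends on a, b, c|d, e|f.
D = {"p": [{"a"}, {"b"}, {"c", "d"}, {"e", "f"}]}
model = solve(encode(D, set(), ["p"]))
print(sorted(pkg for pkg, installed in model.items() if installed))

# Adding a conflict between a and b makes p uninstallable.
print(solve(encode(D, {("a", "b")}, ["p"])))
```

Packages left unassigned by `solve` can be treated as not installed; the packages assigned `True` then form a healthy installation in the sense of Definition 3.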
This apparent contradiction between a theoretically very bad worst-case complexity on the one hand and the existence of implementations that are surprisingly fast for selected problem instances on the other is quite common in computer science.

2.2.2 Tools: edos-debcheck, pkglab and ceve

The edos-debcheck utility (available in Debian in the package of the same name) takes as input a package repository and checks whether one, several or all packages in the repository are installable with respect to that repository. This utility is based on the SAT encoding mentioned in Section 2.2.1 and employs a customized Davis-Putnam SAT solver [ES04]. Since all computations are performed in-memory and some of the encoding work is shared between all packages considered, this is significantly faster than constructing a separate SAT encoding for the installability of each package, and then running an off-the-shelf SAT solver on it. For instance, checking installability of all packages of main testing/amd64 takes only 5 seconds on a dual-core amd64 (emitted warnings about bad package version numbers and other irregularities are omitted):

    edos-debcheck out
    Parsing package file... 1.2 seconds 21617 packages
    Generating constraints... 2.3 seconds
    Checking packages... 1.5 seconds
    4.692u 0.324s 0:05.03 99.6% 0+0k 0+0io 0pf+0w

An explanation in case of non-installability is given, see Figure 5 for an example. We have also developed an RPM version of this tool called edos-rpmcheck.

pkglab is an interpreter for a query language that combines basic queries to edos-debcheck, resp. edos-rpmcheck, with a functional language which allows one to use constructions like map to conveniently manipulate lists of packages. The interpreter allows one to assign intermediate results to variables.
We are planning for the future a major overhaul of the query language with the goal of making it more useful as a scripting language for applications like the one described in Section 2.2.5. The interpreter can load repositories that have been pre-processed by the ceve parser, which can parse and analyze both Debian and RPM repositories. The Debian package for pkglab is pending while the ceve package is currently available in experimental.

2.2.3 Application: Finding Uninstallable Packages in Debian

edos-debcheck is currently used to monitor the state of Debian's distributions (unstable, testing, stable), as well as Skolelinux and Debian GNU/kFreeBSD. The results of the analysis are available at http://edos.debian.net/edos-debcheck.

There are different reasons why non-installable packages actually exist in these distributions. One important reason is that most of the binary packages are architecture dependent, that is, there is one package per architecture. As a consequence, when assessing the reasons for non-installability of packages we have to take into account all possible Debian architectures.

The metadata of a binary package are generated during the package compilation from the metadata in the source package, and may depend on the actual compilation environment or conditional code in the source package. As a consequence, the metadata of a package with the same package name and version may vary from architecture to architecture.

• The unstable distribution is in fact the staging ground for building releasable distributions. Packages that depend on each other enter this distribution in an arbitrary order which depends on when a developer uploads a package, or on when a package is compiled and uploaded by an autobuilder (these are daemons that compile packages for the various architectures).
  For instance, package a may depend on package b, and the developer of a uploads a package for the architecture i386 while the developer of b uploads his package for amd64 (he should have tested package b using a locally built binary package of a on amd64). In this case, a is uninstallable in the repository for i386 until the i386 autobuilder daemon uploads the binary package for b. This is illustrated by Figure 3: the numbers of uninstallable packages in sid are indeed varying from day to day. As a consequence, transient non-installability errors are normal in the unstable distribution. Persistent errors, however, indicate a potential problem.

• A package a may depend on package b, but b is not available on all architectures a is available on. This may be due to the fact that there is a problem with compiling b on some architectures, or that a has a too liberal architecture specification.

• A special case of the latter is that a has its architecture set to all. This indicates a binary package that is in fact the same on all architectures, and hence exists only once in the package pool. Package a may, however, depend on a package b which is architecture dependent but does not exist for every architecture. Introducing a field “Installs-to” in the syntax of control files (as proposed in Bug report #436733¹) would make it possible to fix this.

Packages which aren't installable on any of the architectures of a distribution are more likely due to an error. This may happen with packages that are installable on some architecture that has been part of a distribution in the past, but which has been removed since then. Another possible reason is dependency on a package that had to be removed from a distribution, for instance due to licensing problems or grave bugs.

2.2.4 Application: Debian Weather

This is more of a fun application.
Based on the numbers of the tool described in Section 2.2.3 a “weather report” of Debian is generated which indicates the percentage of non-installable packages for the different distributions and architectures. The interpretation is as follows:

    clear        < 1%
    few clouds   1% ... 2%
    clouds       2% ... 3%
    showers      3% ... 4%
    storm        > 4%

An example weather report is given in Figure 6. Applets for Gnome and KDE are available. The daily updated Debian weather is available on the web at http://edos.debian.net/weather.

2.2.5 Application: Finding File Conflicts in Debian

A Debian installation has the concept of files owned by packages. If one tries to install a new package that would hijack a file owned by another package this will make (with some exceptions, see below) the installation fail, like this:

    Unpacking gcc-avr (from .../gcc-avr_1%3a4.3.0-1_amd64.deb) ...
    dpkg: error processing /var/cache/apt/archives/gcc-avr_1%3a4.3.0-1_amd64.deb (--unpack):
     trying to overwrite ‘/usr/lib64/libiberty.a’, which is also in package binutils
    dpkg-deb: subprocess paste killed by signal (Broken pipe)
    Errors were encountered while processing:
     /var/cache/apt/archives/gcc-avr_1%3a4.3.0-1_amd64.deb
    E: Sub-process /usr/bin/dpkg returned an error code (1)

Our aim is to detect these errors by analyzing the Debian distribution, hopefully before they actually occur on a user machine.

An obvious naïve solution would be to try to install together all pairs of packages that occur in the distribution. Debian amd64/testing has currently about 21,000 packages, that would make about 200,000,000 pairs of packages to test, which clearly is not feasible.

¹ http://bugs.debian.org/436733

unstable/main:

Date   alpha     amd64     arm       armel     hppa      hurd-i386   i386      ia64      m68k        ...  some         every
22/06  949(325)  121(80)   604(126)  609(103)  613(132)  4445(1333)  228(131)  456(120)  8943(4583)  ...  10222(5163)  41(12)
∆      +20/-2    +7/-11    +22/-24   +28/-81   +24/-34   +10/-38     +31/-7    +26/-21   +21/-10     ...  +44/-5       +0/-7
21/06  931(312)  125(78)   606(132)  662(117)  623(141)  4473(1339)  204(109)  451(121)  8932(4586)  ...  10183(5141)  48(12)
∆      +44/-0    +1/-1     +18/-7    +52/-12   +84/-0    +44/-2      +56/-0    +58/-0    +34/-5      ...  +13/-22      +0/-1
20/06  887(287)  125(78)   595(121)  622(108)  539(112)  4431(1337)  148(92)   393(103)  8903(4585)  ...  10192(5150)  49(13)
∆      +90/-5    +6/-65    +17/-77   +21/-14   +14/-63   +15/-2      +19/-65   +13/-64   +26/-15     ...  +28/-9       +1/-2
19/06  802(273)  184(83)   655(129)  615(109)  588(113)  4418(1338)  194(94)   444(107)  8892(4583)  ...  10173(5148)  50(13)
∆      +6/-0     +2/-7     +2/-113   +1/-8     +5/-18    +2/-221     +3/-3     +5/-7     +1/-37      ...  +1/-207      +1/-0
18/06  796(270)  189(87)   766(145)  622(114)  601(120)  4637(1380)  194(96)   446(109)  8928(4588)  ...  10379(5187)  49(13)
∆      +5/-0     +4/-8     +115/-76  +5/-64    +0/-21    +6/-3       +4/-1     +1/-76    +5/-5       ...  +25/-2       +0/-0
17/06  791(268)  193(92)   727(157)  681(142)  622(132)  4634(1379)  191(93)   521(132)  8928(4589)  ...  10356(5167)  49(13)
∆      +12/-12   +11/-1    +14/-57   +15/-74   +67/-105  +4/-32      +4/-42    +9/-67    +16/-1      ...  +8/-19       +0/-1
16/06  791(263)  183(82)   770(175)  740(154)  660(156)  4662(1380)  229(96)   579(145)  8913(4575)  ...  10367(5179)  50(13)

Figure 3: Summary of results of running edos-debcheck on unstable/main between June 16 and June 22, 2008. The architectures mips, mipsel, powerpc, s390, and sparc are omitted from this table for lack of space. In each day's listing, the first number is the number of non-installable packages, while the number in parentheses is the number of non-installable packages that are architecture-specific. Lines marked ∆ give the number of packages becoming uninstallable the following day (+), resp. that are no longer uninstallable (-). This field is colored red when the total number of uninstallable packages is increasing, green when that number is decreasing. Results of a current run can be found at http://edos.debian.net/edos-debcheck/unstable.php.

testing/main:

Date   alpha    amd64    arm      armel    hppa     i386     ia64     mips     mipsel   powerpc  s390     sparc    some     every
23/06  367(7)   14(2)    217(4)   348(21)  369(9)   12(4)    48(3)    267(3)   269(3)   21(3)    56(3)    24(3)    628(32)  8(2)
∆      +0/-0    +0/-0    +0/-1    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-3    +0/-0    +0/-0    +0/-0    +0/-0
22/06  367(7)   14(2)    218(4)   348(21)  369(9)   12(4)    48(3)    267(3)   269(3)   24(4)    56(3)    24(3)    628(32)  8(2)
∆      +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-3    +0/-3    +0/-0    +0/-3    +0/-3    +0/-0    +0/-0
21/06  367(7)   14(2)    218(4)   348(21)  369(9)   12(4)    48(3)    270(4)   272(4)   24(4)    59(4)    27(4)    628(32)  8(2)
∆      +0/-0    +0/-3    +0/-3    +0/-9    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-7    +0/-3
20/06  367(7)   17(3)    221(5)   357(24)  369(9)   12(4)    48(3)    270(4)   272(4)   24(4)    59(4)    27(4)    635(35)  11(3)
∆      +7/-0    +3/-0    +4/-3    +3/-27   +4/-0    +3/-0    +3/-0    +5/-11   +5/-0    +5/-0    +5/-0    +5/-0    +5/-16   +3/-0
19/06  360(5)   14(2)    220(6)   381(31)  365(8)   9(3)     45(2)    276(2)   267(2)   19(2)    54(2)    22(2)    646(42)  8(2)
∆      +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0
18/06  360(5)   14(2)    220(6)   381(31)  365(8)   9(3)     45(2)    276(2)   267(2)   19(2)    54(2)    22(2)    646(42)  8(2)
∆      +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0    +0/-0
17/06  360(5)   14(2)    220(6)   381(31)  365(8)   9(3)     45(2)    276(2)   267(2)   19(2)    54(2)    22(2)    646(42)  8(2)

stable/main:

Date   alpha    amd64    arm      hppa     i386     ia64     mips     mipsel   powerpc  s390     sparc    some     every
23/06  184(0)   13(0)    96(2)    189(0)   0(0)     67(0)    185(0)   186(0)   13(0)    183(0)   144(4)   235(6)   0(0)

Figure 4: The same statistics as in Figure 3, now for testing and stable (only one day shown for stable since there is no variation).

Package         Since      Version       Explanation
...             ...        ...           ...
calendarserver  20 Jun 08  1.2.dfsg-3    calendarserver (= 1.2.dfsg-3) depends on
                                         python-twisted-calendarserver (>= 0.2.0.svn19773-3) {NOT AVAILABLE}
camping         21 Jun 08  1.5+svn242-1  camping (= 1.5+svn242-1) depends on rails {rails (= 2.0.2-2)}
                                         rails (= 2.0.2-2) depends on rdoc (>> 1.8.2) {rdoc (= 4.2)}
                                         rdoc (= 4.2) depends on rdoc1.8 {rdoc1.8 (= 1.8.7.22-1)}
...             ...        ...           ...
rdoc1.8         21 Jun 08  1.8.7.22-1    rdoc1.8 (= 1.8.7.22-1) depends on ruby1.8 (>= 1.8.7.22-1) {NOT AVAILABLE}
...             ...        ...           ...
shoes           21 Jun 08  0.r396-4      shoes (= 0.r396-4) depends on libgems-ruby1.8 {libgems-ruby1.8 (= 1.1.1-1)}
                                         libgems-ruby1.8 (= 1.1.1-1) depends on rdoc1.8 {rdoc1.8 (= 1.8.7.22-1)}

Figure 5: An excerpt from the list of uninstallable packages in sid/i386 main for June 22, 2008. In the explanation field, available versions of a package are indicated between curly brackets. Lines may refer to packages shown non-installable elsewhere, like the packages camping and shoes being non-installable because they need rdoc1.8. Package names written in italics in the left column have Architecture=all. Results of a current run can be found at http://edos.debian.net/edos-debcheck/results/unstable/latest/i386/list.php.

[Weather icons for the distributions Stable, Testing and Unstable on the architectures alpha, amd64, arm, hppa, i386, ia64, mips, mipsel, powerpc]

Figure 6: The Debian weather for June 27, 2008: Mostly sunny in stable and testing, at places overcast and rainy in unstable.
A first idea towards a better solution is to only consider those pairs of packages that actually share at least one file. Luckily, the information which package contains which file is available in the file Contents of the distribution. This file contains stanzas like

    ...
    bin/fbset                    admin/fbset
    bin/fgconsole                utils/console-tools,utils/kbd
    ...
    etc/default/nvidia-kernel    contrib/x11/nvidia-kernel-common
    ...

In this file, information is indexed by path names of the files (omitting the initial slash). For every file a comma-separated list of packages containing that file is given, where packages are indicated with their section (a classification of packages by type, like games or admin), possibly preceded by the component if it is different from main (which can currently be contrib or non-free). For instance, the file /bin/fgconsole is provided by the packages console-tools and kbd, which both are in section utils. In fact the Contents file that can be found on a Debian mirror may be slightly out of date as this file is generated only once per week.

The Contents file of amd64/testing (as of May 2008) contains about 2,300,000 entries. It is a trivial programming exercise to compute from this file a list of pairs of packages that share at least one file.

Sharing a file does not necessarily mean a bug. There are several reasons why it may be OK for two packages, say A and B, to share a file, say F:

1. The two packages are not co-installable by the package relationships declared in their distribution, in the sense of Section 2.1.
2. One of the packages, say A, declares that it has the right to replace files owned by B, by having in its control file a stanza Replaces: B.
3. One of the packages, say B, diverts the file F that it shares with package A.
   This means that if package B is being installed on a system already containing package A, then A's version of file F will be renamed; file F will be restored to its original name when package B is removed. File diversions are declared by invoking the tool dpkg-divert from a maintainer script, which simply registers the diversion request in a system-wide database. This database is consulted by dpkg when installing files. Diversions are not declared in the package control file.

We proceed in two stages in order to find the actual file overwrite problems:

1. Co-installability is checked with the pkglab tool (see Section 2.2.2). This is the only tool that can detect "deep" conflicts between packages. This first phase gives us a reduced list of pairs of packages.
2. Knowing which files are diverted by a package poses different problems: diversions are registered by the so-called postinst script of a package, one of the maintainer scripts that are executed during installation (or upgrade, or removal) of a package. This leads to two problems:
   (a) Execution of the postinst script depends on the current state of the system, and can in general not be described by a simple list of files.
   (b) The postinst script is written in a Turing-complete language (usually POSIX shell or bash), which means that exact semantic properties are undecidable.

For this reason, we try in the second phase to install each of the pairs of packages remaining after the first phase in a chroot, using apt-get install. We then search the install log for file overwrite errors.
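The initial filtering step described above, computing from Contents the pairs of packages that share at least one file, can be sketched as follows. This is a minimal illustration and not the authors' tool: the function names parse_contents and sharing_pairs are made up for this example, and the sketch assumes that the input consists only of stanzas in the format shown earlier (any explanatory header of a real Contents file is assumed to have been stripped already).

```python
from collections import defaultdict
from itertools import combinations


def parse_contents(lines):
    """Yield (path, package_names) for each stanza of a Contents-style file.

    Assumed stanza layout: a path, whitespace, then a comma-separated
    list of [component/]section/package entries.
    """
    for line in lines:
        # Split off the last whitespace-separated field, so that paths
        # containing spaces stay in one piece.
        fields = line.rstrip("\n").rsplit(None, 1)
        if len(fields) != 2:
            continue  # blank line, "..." filler, or malformed stanza
        path, owners = fields
        # Keep only the package name, dropping section and component.
        packages = [owner.rsplit("/", 1)[-1] for owner in owners.split(",")]
        yield path, packages


def sharing_pairs(lines):
    """Map each unordered pair of packages to the set of paths they share."""
    shared = defaultdict(set)
    for path, packages in parse_contents(lines):
        for first, second in combinations(sorted(set(packages)), 2):
            shared[(first, second)].add(path)
    return dict(shared)


sample = [
    "bin/fbset                    admin/fbset",
    "bin/fgconsole                utils/console-tools,utils/kbd",
    "etc/default/nvidia-kernel    contrib/x11/nvidia-kernel-common",
]
pairs = sharing_pairs(sample)
```

On the three sample stanzas, the only pair found is (console-tools, kbd), sharing bin/fgconsole. Such pairs are only candidates: they still have to pass the co-installability check and the installation test before a file overwrite bug can be claimed.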
The following statistics are from the first run, performed on April 16th, 2008, on amd64/sid:

Theoretical pairs of packages according to the distribution   200,000,000
Pairs of packages sharing a file according to Contents                867
Co-installable pairs among these according to pkglab                  102
File overwrites detected                                               27

Checking co-installability with EDOS pkglab took 30 minutes and gave an 88% reduction of the search space (from 867 to 102 pairs). Testing the installation of the remaining 102 pairs of packages still took 2.5 hours. These measurements were taken with a dual-core amd64 at 1.6GHz, using a local Debian mirror accessed over a fast LAN. Detected bugs are tracked in the Debian bug tracking system, and marked there with user treinen@debian.org and usertag edos-file-overwrite.

3 Present and Future: Mancoosi

3.1 An Overview of the Mancoosi Project

Mancoosi picks up the baton from where EDOS left it. So, where to go from EDOS? Even though some of the theoretical achievements of EDOS still have some way to go before reaching the practice of all distributions (including Debian), adoption of EDOS results is ongoing and is actually extending past the distribution universe; a noteworthy example is the Eclipse platform, which is moving to SAT solving to solve inter-plugin dependencies.

In contrast, one side of the complexity issues introduced by the overwhelming number of packages in GNU/Linux distributions has been neglected by EDOS and is still in need of both research and tool development: the user side of a distribution. While EDOS has focused on the distribution editor side (i.e. on who is actually creating the distributions), Mancoosi focuses on who is actually using a distribution, in particular system administrators.

It is well-known that distributions raise difficult problems for administrators.
Distributions evolve rapidly by integrating new versions of software packages that are independently developed. System upgrades may proceed on different paths depending on the current state of a system and the available software packages, and system administrators are faced with choices of upgrade paths, and possibly with failing upgrades. All together, these intertwined problems are referred to as the upgrade problem. The Mancoosi project aims at developing tools for the system administrator that address the upgrade problem.

What constitutes an upgrade problem from the point of view of a system administrator? Intuitively, any possible change to the database of locally installed packages constitutes an upgrade problem. Such changes are usually requested from a meta-installer and are well-known to any system administrator. Some examples:

• apt-get install wesnoth
• aptitude upgrade cappuccino
• apt-get dist-upgrade
• aptitude purge emacs22
• wajig install vim-full

Each of the above examples poses a simple upgrade problem. Far more complex upgrade problems can be formed by combining simpler problems (e.g. posing all the above requests together to a single meta-installer). Yet more complex problems can be created by exploiting meta-installer-specific features such as requiring specific package versions or origin suites (think of apt pinning).

A basic principle of the Mancoosi project is that the upgrade process can be decomposed into two parts: dependency resolution and upgrade deployment. While dependency resolution can be thought of as a static phase, where, without altering the package database, a meta-installer has to figure out if and how to implement the user request, upgrade deployment is more dynamic and consists of several sub-activities: package download, package unpacking, maintainer scripts execution . . .
According to this distinction, the two main avenues pursued by Mancoosi are:

rollback support
   Upgrade deployment can fail for various reasons easily encountered in system administrator nightmares (disks running out of space, 404 while downloading a package, maintainer script failures, file overwrites among unrelated packages, . . . ). Depending on how bad the error is, a common attempted solution is that of rolling back the system, partially or completely, to a safe state which predates the upgrade attempt. Unfortunately, support for upgrade attempt rollback is basically nonexistent in state of the art installers. Note that the need for a rollback may also occur some time after an upgrade (even days or weeks), and that in that case one only wants to undo the package upgrade but not any other system changes that have been applied in the meantime. This means that we are looking for solutions beyond mere file system snapshots.
   Mancoosi aims at developing mechanisms that provide for rollback of failed upgrade attempts, allowing the system administrator to revert the system to the state before the upgrade. In particular, rollback is the topic of Mancoosi work packages 2 and 3 (footnote 2).

dependency solving
   The first part of the upgrade problem is implemented by state of the art meta-installers, but each of them has deficiencies (e.g. incompleteness: the inability to find an upgrade path each time one upgrade path does exist). Mancoosi aims at developing better algorithms to plan upgrade paths based on various information sources about software packages and on optimization criteria. Dependency solving is the topic of Mancoosi work packages 4 and 5.

As the authors are only marginally involved with rollback support, that part of the project will not be discussed any further in this paper. For the rest of this paper we will concentrate on dependency solving.
Footnote 2: http://www.mancoosi.org/work.html

3.2 Dependency solving

As already mentioned, the overall goal of this part of Mancoosi is improving dependency solving in state of the art meta-installers, solving some of their deficiencies. More precisely, Mancoosi plans to address three requirements which are believed to define the ideal towards which any given meta-installer should tend: completeness, optimality, efficiency.

3.2.1 Completeness

The first of these requirements can be defined as follows:

Definition 6 A meta-installer is complete wrt. dependency solving iff for each possible upgrade problem which has a solution, the meta-installer is able to find such a solution.

Even though not enough details have been given to fully formalize completeness in this paper, the intuition should be clear: once the system administrator poses an upgrade problem to their meta-installer of choice, the meta-installer tries to solve dependencies to fulfill the user request, in order to determine which changes should be made to the set of installed packages. If a healthy installation satisfying the user request does exist, then the meta-installer should be able to propose it as a possible way of fulfilling the user request.

Surprising as it might sound, most state of the art meta-installers are not complete. For instance, upon receiving a request like install p, apt-get always tries to install the latest version of p among those available in the package universe formed by APT repositories. In case the version requirements of (latest) p are not satisfiable, it might well be that the requirements of (previous) p are indeed satisfiable. In such and similar cases the user is left with the feeling that there is no way to satisfy her request, while this is actually not the case: this is a lack of completeness that should be addressed to improve user experience with meta-installers.
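The install p situation can be made concrete with a toy sketch. Everything in it is an assumption made for illustration: the package universe, the integer version numbers, the restriction to "depends on q at version >= n" relations, and the two functions. In particular, latest_only is a caricature of the strategy described above, not apt-get's actual algorithm, and exhaustive is complete only because it naively enumerates every combination of versions in a tiny universe.

```python
from itertools import product

# Hypothetical universe: name -> {version: [(dependency, minimum version), ...]}
UNIVERSE = {
    "p": {2: [("q", 2)], 1: [("q", 1)]},  # latest p needs a q that does not exist
    "q": {1: []},
}


def is_healthy(install, universe):
    """True if every installed package has all its dependencies satisfied."""
    for name, version in install.items():
        for dep, minimum in universe[name][version]:
            have = install.get(dep)
            if have is None or have < minimum:
                return False
    return True


def latest_only(request, universe):
    """Consider only the newest version of each package; never look back."""
    install, todo = {}, [request]
    while todo:
        name = todo.pop()
        if name in install:
            continue
        if name not in universe:
            return None
        version = max(universe[name])
        install[name] = version
        todo.extend(dep for dep, _ in universe[name][version])
    return install if is_healthy(install, universe) else None


def exhaustive(request, universe):
    """Try every combination of (absent | some version) for every package."""
    names = sorted(universe)
    options = [[None] + sorted(universe[name], reverse=True) for name in names]
    for choice in product(*options):
        install = {n: v for n, v in zip(names, choice) if v is not None}
        if request in install and is_healthy(install, universe):
            return install
    return None


print(latest_only("p", UNIVERSE))  # None: the request looks unsatisfiable
print(exhaustive("p", UNIVERSE))   # {'p': 1, 'q': 1}: the previous p is installable
```

The exhaustive search is exponential in the number of packages, so it only serves to show what completeness means here: a solution exists, and a complete solver must be able to report it.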
Note that the given example is just a paradigmatic one; more complex examples built on top of the limited backtracking capabilities of other meta-installers can also be provided [EDO06] (see also http://www.mancoosi.org/edos/manager.html for an analysis of the situation in the year 2006). The general point stressed here is that legacy meta-installers, which are advertised as the tools for system administrators to interact with the package database of their machines, should be able to solve dependency problems each time it is possible to do so.

3.2.2 Optimality

Once it can be taken for granted that any possible solution to a dependency problem can be found, it is natural to ask which among all the possible solutions has to be preferred over the others. Note that for any given upgrade problem there are in general several possible solutions. If you consider again the install p request posed to apt-get above, a possible solution for it is to install the version of p whose dependencies are satisfiable together with all its (transitive) dependencies and be done with that. Another valid solution is to install the same set of packages together with a package z which is completely unrelated to p and that does not inhibit a healthy installation. Whereas in these two cases it seems obvious that the former has to be preferred, in the general case there are non-obvious choices to be made. Anyone who has already been faced with aptitude's interactive solution discrimination knows that: in satisfying dependency problems coming from user requests, trade-offs have to be made.

In fact, even before discussing how the optimal solution has to be found among all alternative solutions of a given upgrade problem, there is a need to understand which criteria should be used to define the optimality of a given solution.
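The comparison between the two solutions for install p (with and without the unrelated package z) can be written down as a small sketch. It assumes one possible criterion, counting the packages installed beyond those the user asked for; the dependency q, the version numbers, and the function names are hypothetical and chosen only for this example.

```python
def extra_packages(solution, requested):
    """Number of installed packages that the user did not explicitly request."""
    return len(set(solution) - set(requested))


def preferred(solutions, requested, cost=extra_packages):
    """Return the candidate solution with the lowest cost under one criterion."""
    return min(solutions, key=lambda solution: cost(solution, requested))


requested = {"p"}
without_z = {"p": 1, "q": 1}          # p plus its (hypothetical) dependency q
with_z = {"p": 1, "q": 1, "z": 3}     # same, plus the unrelated package z

best = preferred([with_z, without_z], requested)
print(best)  # {'p': 1, 'q': 1}
```

Passing a different cost function changes the ranking without touching the rest of the code, which is the sense in which the choice of criterion is separate from the search for solutions.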
At the moment some fixed criteria which are likely to address most user needs are being considered; here is a handful of examples:

• minimize the number of extra packages installed with respect to those explicitly mentioned in the user request,
• minimize the download size of packages required to deploy the upgrade solution,
• minimize disk usage after the upgrade (a frequent need for Debian-based embedded distributions),
• upgrade as many packages as possible to the latest available version.
• . . .

Of course different optimization criteria can be in conflict with one another. If on one side this brings the upgrade problem into the vibrant research field of multi-criteria optimization, it also raises the issue of which interface should be given to users to specify their optimization preferences. Moreover, the set of possible optimization criteria should be open-ended as specific user needs arise every day: APT pinning is a practical example of user requests that should be taken into account while choosing an optimal solution, and countless other user-specific requirements can be imagined (e.g.: when you have a choice between two packages choose the one with fewer RC bugs, or even blacklist packages maintained by Random J. Developer as you don't trust him . . . ). For this reason Mancoosi will also be developing a cross meta-installer language to specify optimization criteria with a well-defined semantics, to be used by system administrators to specify their preferences.

3.2.3 Efficiency

Once it is settled what properties we want from the ability of a meta-installer to solve dependencies (completeness and optimality), the attention can be turned to how we would like the given tool to reach a solution . . . and of course we want it to be efficient in finding it.
Even leaving aside the optimization part, dependency solving is per se an NP-complete problem (see Section 2.2.1), hence we cannot hope for a definitive algorithm or implementation delivering upgrade problem solutions instantaneously in any given case. Nevertheless we should strive for the greatest possible efficiency, and in this respect the EDOS results have been encouraging. Mancoosi will focus on finding efficient algorithms which not only take into account package installability "in the void" (i.e. in some installation not specified a priori), but rather which address upgrades starting from an existing user installation.

3.3 A solver competition

Promising to find, in the setting of the Mancoosi project, the most efficient algorithmic solution to the upgrade problem, implementing both completeness and optimality, would have been unwise. This is why Mancoosi chooses a different path: try to raise the awareness of the relevant research communities about the upgrade problem. Historically, the organization of periodic competitions has been a driving factor in pushing further the state of the art in algorithms and tools for complex problems such as SAT. Examples like the SAT competition (footnote 3) and SAT Race (footnote 4) attract every year researchers and practitioners willing to challenge their tools with competitors to determine which is the "best", both in terms of solver capabilities and in terms of execution speed.

Mancoosi will follow a similar path for the upgrade problem faced routinely by meta-installers. A competition of dependency solvers will be organized and is planned to be held in parallel with a research conference on related fields (SAT solving, linear optimization, . . . ). While it is too early to have detailed information on how the competition will be run and organized, some aspects are already clear.
Footnote 3: http://www.satcompetition.org/
Footnote 4: http://www-sr.informatik.uni-tuebingen.de/sat-race-2008/

Figure 7: Data flow of UPDB submissions, from users to the corpus of problems for the competition

Upgrade problem database. To run a solver competition you need a corpus of problems that will be used to challenge the various competitors. In the Mancoosi case the corpus will be called UPDB, for Upgrade Problem DataBase. The way in which it will be assembled is different from other competitions. Instead of creating artificial problems by hand (which would not only be challenging given the typical size of a distribution repository, but would also bear the risk of creating irrelevant problems), the corpus will be composed of problems submitted by users who encountered them.

All in all, the architecture is similar to that of the Debian Popularity Contest (footnote 5): users interested in participating will be asked to install some special-purpose packages which provide the software to gather data and submit it to a central repository. In some cases it will probably be necessary to install modified versions of meta-installers which have been changed to log enough information to fully describe an upgrade problem.

The architecture of problem submission to UPDB is depicted in Figure 7. As various distributions are taking part in the Mancoosi competition, each of them will be providing a staging repository to which problem submissions will be addressed. One such repository will be set up for Debian users as well. As the format of the initial submission is distribution-specific, a further conversion step into a common format used to encode problems is needed. Once the conversion has been done, the upgrade problem is fully abstracted over the origin distribution and can be fed as input to the various solvers which will be taking part in the competition.
Footnote 5: http://popcon.debian.org/

The Mancoosi project will be both organizing the competition (this is the topic of work package 5) and participating in it (work package 4) with a research team which is expert in SAT solving and optimization techniques and which will be developing ad-hoc algorithms for the upgrade problem as faced in distributions.

Types of competitions. Different kinds of competitions will be held. In the beginning it is planned that the optimization criteria will be fixed and each competitor will specifically be participating in a selection of them. For example it is likely that we will be having categories like: no optimization (just solve the upgrade problem no matter what), minimize the download size of required packages, minimize disk usage, and so on.

Upgrade Description Formats. As can be observed in Figure 7, different format specifications are required before being able to start collecting upgrade problems from users (notwithstanding specification implementations, which will be required as well). Such specifications are work in progress and are available in the Mancoosi public repository at http://gforge.info.ucl.ac.be/plugins/scmsvn/viewcvs.php/trunk/updb/doc/cudf/?root=mancoosi .

The first specification, DUDF (Distribution Upgrade Description Format), is meant to describe the format used for the actual submission of upgrade problems from user machines to the repositories set up by each distribution interested in collecting upgrade problems. As the format is in the end distribution-specific, the specifications describe the overall structure and basic principles of a submission document; the actual details will be filled in by each distribution according to the user installers and meta-installers.
Interested distributions are encouraged, once the final version of DUDF is ready, to publish notes describing exactly how they are implementing the distribution-specific part of DUDF.

Roughly, a DUDF document has the following parts:

1. local package status on the user machine
2. current package universe as known to the meta-installer
3. requested action
4. user desiderata (i.e. optimization criteria)
5. various identifiers (e.g.: distribution identifier, installer name and version, meta-installer name and version, . . . )
6. outcome of the meta-installer (a new local package status in case of success, a failure message otherwise)

A hypothetical (and incomplete) mapping to Debian for apt-get, just to give a practical intuition of what can constitute a DUDF submission, is as follows:

1. /var/lib/dpkg/status
2. the set of APT binary package lists as stored under /var/lib/apt/lists/
3. the given APT command
4. current APT pinning settings
5. "debian", "apt-get", vx.y.z, "dpkg", . . .
6. "broken packages, the following packages can not be installed, . . . ."

As sending all the above information can be costly in terms of submission size, DUDF implements some space optimizations. The most important optimization is based on the assumption that most package lists composing a given package universe are usually only mirrored on a local machine and are available elsewhere. Hence, by keeping distribution-specific historical mirrors of a given distribution, instead of sending whole package lists, a DUDF submission may just contain package list checksums that can later be looked up in historical mirrors to recreate the package lists as available on user machines.
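The checksum-based space optimization can be illustrated with a short sketch. It is not taken from the DUDF specification: the entry layout ({"checksum": ...} versus {"inline": ...}), the HistoricalMirror class, and the choice of SHA-256 are all assumptions made for this example, and the submitting side is simplified by letting it query the mirror directly.

```python
import hashlib


def checksum(package_list):
    """Checksum of a package list (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(package_list.encode("utf-8")).hexdigest()


class HistoricalMirror:
    """Hypothetical stand-in for a historical mirror: checksum -> package list."""

    def __init__(self):
        self._by_checksum = {}

    def archive(self, package_list):
        digest = checksum(package_list)
        self._by_checksum[digest] = package_list
        return digest

    def lookup(self, digest):
        return self._by_checksum.get(digest)


def compact(package_lists, mirror):
    """Replace every package list known to the mirror by its checksum."""
    entries = []
    for text in package_lists:
        digest = checksum(text)
        if mirror.lookup(digest) is not None:
            entries.append({"checksum": digest})
        else:
            entries.append({"inline": text})  # unknown list: send it whole
    return entries


def expand(entries, mirror):
    """Recreate the package lists as they were on the user machine."""
    lists = []
    for entry in entries:
        if "inline" in entry:
            lists.append(entry["inline"])
            continue
        text = mirror.lookup(entry["checksum"])
        if text is None:
            raise KeyError("package list not found in historical mirror")
        lists.append(text)
    return lists


mirror = HistoricalMirror()
official = "Package: p\nVersion: 1\n"
local_only = "Package: site-local\nVersion: 0.1\n"
mirror.archive(official)

submission = compact([official, local_only], mirror)
restored = expand(submission, mirror)
```

In this sketch the official list travels as a 64-character digest while the list unknown to the mirror is sent in full, and expanding the submission gives back both lists unchanged.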
In the specific case of Debian, Mancoosi will be keeping historical mirrors of APT lists for the most widespread apt-get repositories: not only the official stable/testing/unstable Debian suites, but also volatile, backports, debian-multimedia, . . .

The second, and last, document format involved with the solver competition is CUDF (Common Upgrade Description Format). That is the format in which the actual inputs for competition participants will be encoded. Contrary to DUDF, CUDF is distribution-agnostic as well as agnostic to any specific installer or meta-installer. A requirement for any given DUDF document is that it can be converted to CUDF; during that conversion step all performed space optimizations will be expanded to obtain a self-contained description of an upgrade problem.

3.4 Debian and Mancoosi

As already mentioned, there is no "official" relation between the Mancoosi and Debian projects; however, there are Debian developers in the ranks of Mancoosi who are interested in giving back to Debian as much as possible of Mancoosi's achievements. This section lists the foreseeable points of contact between Mancoosi and Debian; it also points to the available resources for interacting with Mancoosi from the Debian side.

Probably the main point of interest for Debian in Mancoosi is the possibility to improve the available algorithms and tools for dependency solving, both from the point of view of performance and the point of view of capabilities. To be delivered in Debian, the possible forthcoming achievements will need cooperation among the algorithm developers and the developers of meta-installers used in Debian (apt-get, aptitude, . . . ). The Debian developers involved in Mancoosi have already made contact with members of the respective development teams.
Collaborations are needed mainly in two areas:

common solver API
   It is unlikely that Mancoosi will have the energy to port novel dependency resolution algorithms to multiple meta-installers; it is more likely that only a proof of concept implementation for a single tool will be developed. As Debian is also about diversity, it would be preferable to have implementations for all the mainstream meta-installers. To this end, a side result that will be pursued is the development of a common API to let whatever meta-installer interact with an external dependency solver. This way it would be possible to develop dependency solvers separately and plug them into different tools. Such an achievement, if reached, would also mean that it will be possible to exchange solvers which already exist among different tools, gaining flexibility in the overall package manager implementation.

dependency solving logging
   Once the specification of DUDF is finalized, its implementations will basically consist of patches (or plugins, where feasible) for meta-installers, enabling them to save in DUDF format the solving attempts originating from upgrade problems. As it will be beneficial to have a common format for logging such attempts (e.g. for bug reports against apt-get, aptitude, . . . ) we hope to spread DUDF implementations to whatever meta-installer is currently used in Debian.

On a less implementation-oriented side, Mancoosi is welcoming comments from the Debian community on all aspects of the project. In particular, at the time of this writing we are interested in comments on what will constitute interesting optimization criteria, such as those anticipated in Section 3.2.2. The corpus of collected optimization criteria is likely to be used as the set of categories to run the first solver competition.
Do not hesitate to get in touch with the Mancoosi project if you have suggestions on this topic or on anything else related to the project!

There are various ways to get in touch with Mancoosi.

• The official website gives general information on the Mancoosi project; it is available at http://www.mancoosi.org
• The mailing list to archive public discussions about Mancoosi is mancoosi-discuss: http://sympa.pps.jussieu.fr/wws/info/mancoosi-discuss
• Then there are also Debian-specific contacts:
  – http://mancoosi.debian.net has been set up as a web archive of resources for the Debian project offered by Mancoosi. At the moment it just contains the historical mirror of APT's binary package lists which will be used to implement the space optimization of DUDF. It also contains an apt-get repository of unofficial Debian packages meant as a staging area for packages not (yet) accepted in the Debian archive, or simply not suitable/interesting enough for it.
  – the email contact debian@mancoosi.org is the main contact to get in touch with Mancoosi for Debian-related issues, questions, comments . . . Drop a mail to it for more information!

References

[EDO05] EDOS Project Workpackage 2 Team. Report on formal management of software dependencies. EDOS Project Deliverable Work Package 2, Deliverable 1, September 2005. http://www.edos-project.org/xwiki/bin/Main/Deliverables.

[EDO06] EDOS Project Workpackage 2 Team. Report on formal management of software dependencies. EDOS Project Deliverable Work Package 2, Deliverable 2, March 2006. http://www.edos-project.org/xwiki/bin/Main/Deliverables.

[ES04] Niklas Eén and Niklas Sörensson. An extensible SAT-solver. In Enrico Giunchiglia and Armando Tacchella, editors, Theory and Applications of Satisfiability Testing, 6th International Conference, SAT 2003. Santa Margherita Ligure, Italy, May 5-8, 2003, Selected Revised Papers, volume 2919 of Lecture Notes in Computer Science, pages 502–518. Springer, 2004.

[MBC+06] Fabio Mancinelli, Jaap Boender, Roberto Di Cosmo, Jérôme Vouillon, Berke Durak, Xavier Leroy, and Ralf Treinen. Managing the complexity of large free and open source package-based software distributions. In ASE 2006, pages 199–208, Tokyo, Japan, September 2006. IEEE CS Press.