~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cheesecake: How tasty is your code?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. contents:: **Table of Contents**

Summary
-------

The idea of the Cheesecake project is to rank Python packages based on various
empirical "kwalitee" factors, such as:

 * whether the package can be downloaded from PyPI given its name
 * whether the package can be downloaded from a full URL
 * whether the package can be unpacked
 * whether the unpack directory is the same as the package name
 * whether the package can be installed into an alternate directory
 * existence of certain files such as README, INSTALL, LICENSE, setup.py etc.
 * existence of certain directories such as doc, test, demo, examples
 * percentage of modules/functions/classes/methods with docstrings
 * percentage of functions/methods that are unit tested (not currently
   implemented)
 * average pylint score for all non-test and non-demo modules

Currently, the Cheesecake index is computed for individual packages obtained
through a variety of methods (detailed below). One of the goals of the
Cheesecake project is to automatically compute the Cheesecake index for
all packages uploaded to the PyPI Cheese Shop (possibly at upload time) and
to maintain a collection of Web pages with statistics related to the
various indexes of the packages.

Cheesecake currently computes 3 types of indexes:

 * installability index
 * documentation index
 * code kwalitee index

The algorithms for computing each index type are detailed below.

Why Cheesecake?
---------------

The concept of "kwalitee" originated in the Perl community. Here's a relevant
quote:

  *It looks like quality, it sounds like quality, but it's not quite quality.*

Kwalitee is an empirical measure of how good a specific body of code is. It
defines quality indicators and measures the code along them. It is currently
used by the `CPANTS Testing Service <http://cpants.dev.zsi.at/index.html>`_
to evaluate the 'goodness' of CPAN packages.

Since the Python package repository (aka `PyPI <http://www.python.org/pypi>`_)
is hosted at the Cheese Shop, it stands to reason that the quality indicator
of a PyPI package should be called the Cheesecake index!

Usage examples
--------------

To compute the Cheesecake index for a given project, run the cheesecake.py
module from the command line and specify one of the following:

 * the package short name (e.g. twill) or
 * the package URL (e.g. http://darcs.idyll.org/~t/projects/twill-0.7.4.tar.gz) or
 * the package path on the file system (e.g. /tmp/twill-latest.tar.gz)

In all cases, the cheesecake module will attempt to download the package
if necessary, then to unpack it in a sandbox directory (/tmp/cheesecake_sandbox
by default). If either of these operations fails, the Cheesecake index for
the package will be 0. If the package can be successfully unpacked, the
cheesecake module will compute the values for a variety of indexes, as
detailed in the algorithm section later in this file.

If the package can be successfully downloaded and unpacked, a log file is
created in the sandbox directory and named <package>.log (e.g. the log file
for twill-0.7.4.tar.gz is /tmp/cheesecake_sandbox/twill-0.7.4.tar.gz.log).
The log file is not automatically deleted after the Cheesecake index is
computed, since its purpose is to be inspected for debug information.

Command-line examples:

 1. Compute the Cheesecake index for the Durus package by using setuptools
    utilities to download the package from PyPI::

      python cheesecake.py --name=Durus

 2. Compute the Cheesecake index for the Durus package by indicating its URL::

      python cheesecake.py --url=http://www.mems-exchange.org/software/durus/Durus-3.1.tar.gz

 3. Compute the Cheesecake index for the twill package by indicating its path
    on the local file system::

      python cheesecake.py --path=/tmp/twill-latest.tar.gz

 4. To increase the verbosity of the output, use the -v or --verbose option.
    For more options, run cheesecake.py with -h or --help.

Obtaining the source code
-------------------------

There is no release for Cheesecake yet, but you can get the source code via svn::

  svn co http://svn.pycheesecake.org/trunk cheesecake

*Note*: make sure you indicate the target directory when you do the svn checkout,
otherwise the cheesecake package files will be checked out directly in your
current directory.

The source code is also available for
`browsing <http://pycheesecake.org/browser>`_ on the project's Trac wiki.

You may want to modify your Subversion client configuration to automatically
expand keywords such as $Id$ and $Author$. To do so, add the following line to
the ``[miscellany]`` section of ``~/.subversion/config``::

  enable-auto-props = yes

and the following line to the ``[auto-props]`` section::

  *.py = svn:eol-style=native;svn:keywords=Author Date Id Revision

Mailing list
------------

Developer mailing list: http://lists.sourceforge.net/lists/listinfo/cheesecake-devel

License
-------

Cheesecake is licensed under the Python Software Foundation license,
the same license that governs Python itself. The text of the license is
available in the ``LICENSE`` file in the source code distribution and
can also be downloaded from
http://www.opensource.org/licenses/PythonSoftFoundation.php.

Author contact info
-------------------

Grig Gheorghiu

:Email: <grig at gheorghiu dot net>
:Web site: http://agiletesting.blogspot.com

Michal Kwiatkowski

:Email: <ruby at joker.linuxstuff.pl>
:Web site: http://joker.linuxstuff.pl

Algorithm for computing the Cheesecake index
--------------------------------------------

The cheesecake.py module uses the following constants::

 INDEX_PYPI_DOWNLOAD = 50
 INDEX_PYPI_DISTANCE = 5
 INDEX_URL_DOWNLOAD  = 25
 INDEX_UNPACK        = 25
 INDEX_UNPACK_DIR    = 15
 INDEX_INSTALL       = 50
 INDEX_FILE_CRITICAL = 15
 INDEX_FILE          = 10
 INDEX_FILE_PYC      = 20
 INDEX_DIR_CRITICAL  = 25
 INDEX_DIR           = 20
 INDEX_DIR_EMPTY     = 5

 MAX_INDEX_DOCSTRINGS = 100 # max. percentage of modules/classes/methods/functions with docstrings
 MAX_INDEX_PYLINT     = 100 # max. pylint score

**Step 0**

Initialize the Cheesecake index to 0. Also initialize to 0
the partial Cheesecake indexes for installability, documentation
and code kwalitee.

Compute the maximum overall Cheesecake index that can be reached by
any given package, which is the sum::

 INDEX_PYPI_DOWNLOAD +
 INDEX_UNPACK + INDEX_UNPACK_DIR +
 INDEX_INSTALL +
 MAX_INDEX_DOCSTRINGS + MAX_INDEX_PYLINT +
 (INDEX_FILE * number_of_expected_files) +
 (INDEX_FILE_CRITICAL * number_of_expected_critical_files) +
 (INDEX_DIR * number_of_expected_dirs) +
 (INDEX_DIR_CRITICAL * number_of_expected_critical_dirs)

Compute the maximum Cheesecake index for installability, which is the sum::

 INDEX_PYPI_DOWNLOAD +
 INDEX_UNPACK + INDEX_UNPACK_DIR +
 INDEX_INSTALL

Compute the maximum Cheesecake index for documentation, which is the sum::

 (INDEX_FILE * number_of_expected_files) +
 (INDEX_FILE_CRITICAL * number_of_expected_critical_files) +
 (INDEX_DIR * number_of_expected_dirs) +
 (INDEX_DIR_CRITICAL * number_of_expected_critical_dirs) +
 MAX_INDEX_DOCSTRINGS

Compute the maximum Cheesecake index for code kwalitee, which is currently::

 MAX_INDEX_PYLINT
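
As an illustration only, the sums of Step 0 can be written out as a small
Python function. The function name ``max_indexes`` and its parameters are
made up for this sketch and are not taken from cheesecake.py; the counts
passed in at the end come from the file and directory lists in Steps 5 and 6.

```python
# Constants copied from the listing above.
INDEX_PYPI_DOWNLOAD = 50
INDEX_UNPACK = 25
INDEX_UNPACK_DIR = 15
INDEX_INSTALL = 50
INDEX_FILE_CRITICAL = 15
INDEX_FILE = 10
INDEX_DIR_CRITICAL = 25
INDEX_DIR = 20
MAX_INDEX_DOCSTRINGS = 100
MAX_INDEX_PYLINT = 100


def max_indexes(n_files, n_critical_files, n_dirs, n_critical_dirs):
    """Return the maximum (installability, documentation, kwalitee, overall)."""
    installability = (INDEX_PYPI_DOWNLOAD + INDEX_UNPACK
                      + INDEX_UNPACK_DIR + INDEX_INSTALL)
    documentation = (INDEX_FILE * n_files
                     + INDEX_FILE_CRITICAL * n_critical_files
                     + INDEX_DIR * n_dirs
                     + INDEX_DIR_CRITICAL * n_critical_dirs
                     + MAX_INDEX_DOCSTRINGS)
    kwalitee = MAX_INDEX_PYLINT
    # The overall maximum is the sum of the three partial maxima.
    overall = installability + documentation + kwalitee
    return installability, documentation, kwalitee, overall


# 8 regular files, 3 critical files, 2 regular and 2 critical directories.
print(max_indexes(8, 3, 2, 2))
```

With these counts the sums come to 140 for installability, 315 for
documentation, 100 for code kwalitee and 555 overall. The sample output later
in this file reports the same 140, 100 and 555, but lists 415 for
documentation, which these sums do not reproduce.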

**Step 1a**

If the short name of the package was specified with ``-n`` or ``--name``,
try to download the package from the PyPI index page by following the links to
the package home page and the package download URL (this is accomplished
using setuptools utilities).

If not successful, exit with a Cheesecake index of 0. If successful and the
package was found at the Cheese Shop, add ``INDEX_PYPI_DOWNLOAD`` to
the overall Cheesecake index and to the installability Cheesecake index.

If successful but the package was not found at the Cheese Shop, add
``INDEX_PYPI_DOWNLOAD - (INDEX_PYPI_DISTANCE * number_of_links_to_package)``
to the overall Cheesecake index and to the installability Cheesecake index.
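
The scoring rule of Step 1a can be sketched as follows. This is not the code
from cheesecake.py: the function ``pypi_download_index`` and its parameters
are hypothetical, and the actual download through setuptools is left out.

```python
INDEX_PYPI_DOWNLOAD = 50
INDEX_PYPI_DISTANCE = 5


def pypi_download_index(downloaded, links_followed=0):
    """Score a download by package name.

    downloaded     -- False if the package could not be fetched at all
    links_followed -- hypothetical count of links followed away from the
                      Cheese Shop; 0 means the package was found there
    """
    if not downloaded:
        return 0
    # Each link followed away from the Cheese Shop costs INDEX_PYPI_DISTANCE.
    return INDEX_PYPI_DOWNLOAD - INDEX_PYPI_DISTANCE * links_followed
```

For example, ``pypi_download_index(True, 1)`` gives 45, the value shown for
``index_pypi_download`` in the sample output, where one link was followed.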

**Step 1b**

If the full URL of the package was specified with ``-u`` or ``--url``,
try to download the package from the specified URL.

If not successful, exit with a Cheesecake index of 0. If successful,
add ``INDEX_URL_DOWNLOAD`` to the overall Cheesecake index and to
the installability Cheesecake index.

**Step 1c**

If the path to the package on the local file system was specified with ``-p``
or ``--path``, copy the package to the sandbox directory.

**Step 2**

Unpack the package (currently supported archive types are zip and
tar.gz/tgz; in the near future we will support Python Eggs.)

If not successful, exit with a Cheesecake index of 0. If successful, add
``INDEX_UNPACK`` to the overall Cheesecake index and to the installability
Cheesecake index.

**Step 3**

Check that the unpack directory has the same name as the package name
(i.e. when unpacking twill-0.7.4.tar.gz, we expect the unpack directory
to be twill-0.7.4.)

If the unpack directory name is the same as the package name, add
``INDEX_UNPACK_DIR``
to the overall Cheesecake index and to the installability Cheesecake index.
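
A minimal sketch of the Step 3 comparison, assuming the expected directory
name is simply the archive file name with its extension removed. The helper
``unpack_dir_index`` and the ``ARCHIVE_SUFFIXES`` tuple are illustrative
names, not part of cheesecake.py.

```python
INDEX_UNPACK_DIR = 15

# Archive extensions named in Step 2.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".zip")


def unpack_dir_index(archive_name, unpack_dir):
    """Return INDEX_UNPACK_DIR if unpack_dir matches the archive name."""
    expected = archive_name
    for suffix in ARCHIVE_SUFFIXES:
        if archive_name.endswith(suffix):
            # Drop the archive extension to get the expected directory name.
            expected = archive_name[:-len(suffix)]
            break
    return INDEX_UNPACK_DIR if unpack_dir == expected else 0
```

With the example from the text,
``unpack_dir_index("twill-0.7.4.tar.gz", "twill-0.7.4")`` yields 15.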

**Step 4**

Install the package to a temporary directory in a non-default location.
If successful, add ``INDEX_INSTALL`` to the overall Cheesecake index and to the
installability Cheesecake index.

**Step 5**

Check for existence of specific files.
For each file found, add ``INDEX_FILE`` to the overall
Cheesecake index and to the documentation Cheesecake index.
If the file is deemed critical, add ``INDEX_FILE_CRITICAL`` instead.

The following special files ("cheese_files") are currently checked::

    cheese_files = ["install", "changelog",
                    "news", "faq",
                    "todo", "thanks", "announce",
                    "ez_setup.py",
                   ]

The following files are currently deemed critical::

    critical_cheese_files = ["readme", "license", "setup.py"]

To check if a file FILE is among the cheese files, the following regular
expression is used::

    re.search(r"^%s(\.txt)*" % cheese_file, FILE, re.IGNORECASE)
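
To show how the lists and the regular expression fit together, here is a
rough sketch of the Step 5 scoring. The function ``file_index`` is a
hypothetical helper, and it assumes that each expected file counts at most
once no matter how many file names match it.

```python
import re

INDEX_FILE = 10
INDEX_FILE_CRITICAL = 15

cheese_files = ["install", "changelog", "news", "faq",
                "todo", "thanks", "announce", "ez_setup.py"]
critical_cheese_files = ["readme", "license", "setup.py"]


def file_index(found_files):
    """Sum the points for every expected file present in found_files."""
    total = 0
    for names, points in ((cheese_files, INDEX_FILE),
                          (critical_cheese_files, INDEX_FILE_CRITICAL)):
        for cheese_file in names:
            pattern = r"^%s(\.txt)*" % cheese_file
            # Case-insensitive match anchored at the start of the file name.
            if any(re.search(pattern, f, re.IGNORECASE) for f in found_files):
                total += points
    return total
```

Called with ``["README.txt", "LICENSE", "setup.py", "FAQ", "INSTALL"]`` this
returns 65. Note that the pattern is anchored only at the start of the name,
so a file such as ``readme_old`` also matches ``readme``.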

**Step 6**

Check for existence of specific directories.
For each directory found, add ``INDEX_DIR`` to the overall Cheesecake
index and to the documentation Cheesecake index.
If the directory is deemed critical, add ``INDEX_DIR_CRITICAL`` instead.
If the directory is found empty, add ``INDEX_DIR_EMPTY`` instead.

The following directories ("cheese_dirs") are currently checked::

    cheese_dirs = ["example", "demo"]

The following directories are currently deemed critical::

    critical_cheese_dirs = ["doc", "test"]

To check if a directory DIR is among the cheese directories,
the following regular expression is used::

    re.search(r"^%s" % cheese_dir, DIR, re.IGNORECASE)
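
The directory check can be sketched along the same lines. The helper
``dir_index`` below is hypothetical, and it makes two assumptions that the
text does not settle: only directories directly under the unpacked package
root are examined, and an expected directory earns full points if at least
one matching directory has content.

```python
import os
import re

INDEX_DIR = 20
INDEX_DIR_CRITICAL = 25
INDEX_DIR_EMPTY = 5

cheese_dirs = ["example", "demo"]
critical_cheese_dirs = ["doc", "test"]


def dir_index(package_root):
    """Sum the points for expected directories directly under package_root."""
    present = [d for d in os.listdir(package_root)
               if os.path.isdir(os.path.join(package_root, d))]
    total = 0
    for names, points in ((cheese_dirs, INDEX_DIR),
                          (critical_cheese_dirs, INDEX_DIR_CRITICAL)):
        for cheese_dir in names:
            matches = [d for d in present
                       if re.search(r"^%s" % cheese_dir, d, re.IGNORECASE)]
            if not matches:
                continue
            # Assumption: full points if any matching directory has content,
            # INDEX_DIR_EMPTY if all matching directories are empty.
            non_empty = any(os.listdir(os.path.join(package_root, d))
                            for d in matches)
            total += points if non_empty else INDEX_DIR_EMPTY
    return total
```

Because the pattern matches prefixes, directories named ``docs`` or ``tests``
count as ``doc`` and ``test``.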

**Step 7**

Check for existence of .pyc files. If found, decrease the score
by subtracting ``INDEX_FILE_PYC`` from the overall Cheesecake index
and from the documentation Cheesecake index.
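
One way to picture the Step 7 check is the sketch below. ``pyc_penalty`` is
an illustrative name, and the sketch assumes the penalty is applied once per
package, however many .pyc files are present.

```python
import os

INDEX_FILE_PYC = 20


def pyc_penalty(package_root):
    """Return -INDEX_FILE_PYC if any .pyc file exists under package_root."""
    for _dirpath, _dirnames, filenames in os.walk(package_root):
        if any(name.endswith(".pyc") for name in filenames):
            return -INDEX_FILE_PYC
    return 0
```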

**Step 8**

Compute the percentage of modules/classes/methods/functions that have
docstrings associated with them. Only Python modules that are not in test,
doc, demo or example directories are checked.
Round up the percentage and add it to the overall Cheesecake index and to the
documentation Cheesecake index.
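
The docstring percentage could be computed roughly as follows. This sketch
uses the standard ``ast`` module, which is an assumption about technique and
not necessarily how cheesecake.py does it; ``docstring_index`` is a
hypothetical name, and the filtering of test, doc, demo and example
directories is left to the caller.

```python
import ast
import math


def docstring_index(sources):
    """Return (documented, total, index) for an iterable of source strings."""
    documented = total = 0
    for source in sources:
        for node in ast.walk(ast.parse(source)):
            # Count modules, classes, and functions/methods.
            if isinstance(node, (ast.Module, ast.ClassDef,
                                 ast.FunctionDef, ast.AsyncFunctionDef)):
                total += 1
                if ast.get_docstring(node) is not None:
                    documented += 1
    percentage = 100.0 * documented / total if total else 0.0
    # Step 8 says to round the percentage up.
    return documented, total, math.ceil(percentage)
```

Rounding up is consistent with the sample output, where 104 documented
objects out of 249 (41.77%) yield an index of 42.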

**Step 9**

If pylint is present on the system, run pylint against all Python files
that are not in the test, docs or demo directories.
Average the non-negative pylint scores, multiply the average by 10 and
add it to the overall Cheesecake index and to the code kwalitee
Cheesecake index.
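
The averaging in Step 9 amounts to the small calculation below. The sketch
assumes the per-module pylint scores have already been collected into a list
(pylint itself is not invoked here), and ``pylint_index`` is a made-up name.
How the result is rounded is not stated in the text, so no rounding is
applied.

```python
def pylint_index(scores):
    """Average the non-negative pylint scores and scale them to 0-100."""
    usable = [s for s in scores if s >= 0]
    if not usable:
        return 0.0
    return 10.0 * sum(usable) / len(usable)
```

For instance, ``pylint_index([8.0, 6.0, -2.5])`` ignores the negative score
and returns 70.0.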

**Step 10**

For each of the partial Cheesecake index types (installability,
documentation and code kwalitee), display the absolute Cheesecake
index for that type as the sum of all indexes of that type computed in
the previous steps.
Also display the relative Cheesecake index for that type as the percentage
of ``(absolute_index / maximum_index)``.

Display the absolute Cheesecake index for the package as the sum of all
indexes computed in the previous steps. Also display the relative Cheesecake
index for the package as the percentage of ``(absolute_index / maximum_index)``.
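
The relative index is a plain percentage. The sketch below assumes the
percentage is truncated to a whole number, which agrees with the figures in
the sample output (for example, 157 out of 415 is shown as 37%);
``relative_index`` is a hypothetical helper name.

```python
def relative_index(absolute_index, maximum_index):
    """Return absolute_index / maximum_index as a whole-number percentage."""
    if maximum_index == 0:
        return 0
    # Integer division truncates the percentage toward zero.
    return absolute_index * 100 // maximum_index
```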

Sample output
-------------

::

 $ python cheesecake.py -n Durus
 [cheesecake:console] Trying to download package durus from PyPI using setuptools utilities
 [cheesecake:console] Downloaded package Durus-3.1.tar.gz from http://www.mems-exchange.org/software/durus/Durus-3.1.tar.gz
 [cheesecake:console] Detailed info available in log file /tmp/cheesecake_sandbox/durus.log
 [cheesecake:console] A given package can currently reach a MAXIMUM number of 555 points
 [cheesecake:console] Starting computation of Cheesecake index for package 'Durus-3.1.tar.gz'

 [cheesecake:console] Starting computation of INSTALLABILITY index (max. points = 140)
 index_pypi_download .....................  45 (downloaded package Durus-3.1.tar.gz following 1 link from PyPI)
 index_unpack ............................  25 (package untar-ed successfully)
 index_unpack_dir ........................  15 (unpack directory is Durus-3.1 as expected)
 index_install ...........................  50 (package installed in /tmp/cheesecake_sandbox/tmp_install_Durus-3.1)
 ---------------------------------------------
 INSTALLABILITY INDEX (ABSOLUTE) ......... 135
 INSTALLABILITY INDEX (RELATIVE) .........  96 (135 out of a maximum of 140 points is 96%)

 [cheesecake:console] Starting computation of DOCUMENTATION index (max. points = 415)
 index_file_announce .....................   0 (file not found)
 index_file_changelog ....................   0 (file not found)
 index_file_ez_setup.py ..................   0 (file not found)
 index_file_faq ..........................  10 (file found)
 index_file_install ......................  10 (file found)
 index_file_license ......................  15 (critical file found)
 index_file_news .........................   0 (file not found)
 index_file_readme .......................  15 (critical file found)
 index_file_setup.py .....................  15 (critical file found)
 index_file_thanks .......................   0 (file not found)
 index_file_todo .........................   0 (file not found)
 index_dir_demo ..........................   0 (directory not found)
 index_dir_doc ...........................  25 (critical directory found)
 index_dir_example .......................   0 (directory not found)
 index_dir_test ..........................  25 (critical directory found)
 index_docstrings ........................  42 (found 104/249=41.77% modules/classes/methods/functions with docstrings)
 ---------------------------------------------
 DOCUMENTATION INDEX (ABSOLUTE) .......... 157
 DOCUMENTATION INDEX (RELATIVE) ..........  37 (157 out of a maximum of 415 points is 37%)

 [cheesecake:console] Starting computation of CODE KWALITEE index (max. points = 100)
 index_pylint ............................  64 (average score is 6.30 out of 10)
 ---------------------------------------------
 CODE KWALITEE INDEX (ABSOLUTE) ..........  64
 CODE KWALITEE INDEX (RELATIVE) ..........  64 (64 out of a maximum of 100 points is 64%)

 =============================================
 OVERALL CHEESECAKE INDEX (ABSOLUTE) ..... 356
 OVERALL CHEESECAKE INDEX (RELATIVE) .....  64 (356 out of a maximum of 555 points is 64%)

Future plans
------------

Cheesecake is under very active development. The immediate goal is to add the unit test
index measurement, followed by other metrics inspired by the
`kwalitee indicators <http://cpants.dev.zsi.at/kwalitee.html>`_.
Please edit the `IndexMeasurementIdeas <http://pycheesecake.org/wiki/IndexMeasurementIdeas>`_
Wiki page to add things that you would like to see covered
by the Cheesecake metrics.

.. footer:: Last modified 2006-05-25 by `Michal Kwiatkowski <http://joker.linuxstuff.pl>`_.