GSoC final report

August 23, 2011 1 comment

Hello everyone,

This is the final report for the Package statistics project.

Homepage : https://soc.dev.gentoo.org/gentoostats/

Repository : http://git.overlays.gentoo.org/gitweb/?p=proj/gentoostats.git

Summary

The goal of this project is to implement a client-server architecture for reporting and querying package statistics of Gentoo-based machines. The client program collects package statistics from Gentoo installations and submits them to a central server. The server calculates useful statistics from the global dataset, which developers as well as end users can access via an intuitive web interface.

Detailed summary

The gentoostats project consists mainly of three components:

  • https://soc.dev.gentoo.org/gentoostats/ : The webapp which collects data submitted by clients and renders the required stats.
  • gentoostats-send : The script which reads portage and package data and submits them to the server.
  • gentoostats-cli : The script which talks to the gentoostats webapp via a RESTful API, and reads and displays stats.

As of the “pencils down” date, all of the above components are working, and quite a lot of stats are rendered successfully. Of course, I have dropped some features from my original proposal, but also added some. Besides this, I also wrote some patches to packages.gentoo.org, though they haven’t been merged yet.

Future plans

I am looking forward to continuing to work on and improve this project. I would also very much like to join the community as a Gentoo dev.

Some possible future goals are :

* The webUI is fugly at this point, mostly because I suck at web design. It could be improved a lot, using the underlying JSON API.

* Portage gui apps could be patched to use stats from the webapp.

* A popular request for stats is adding “installed files” to the stats. This requires an ingenious solution since the dataset is huge.

Thanks

Off the top of my head, I would like to thank antarus, dberkholz, robbat2, the infra team, #gentoo-portage, #gentoo-dev-help and #gentoo-soc, without whom this SoC wouldn’t have been a success.

Categories: gentoo, gsoc, planet-gentoo

GSoC weekly report #7

August 8, 2011 Leave a comment

Hello people,

The package statistics project is alive and in progress. Over the last weeks, I worked on the following:

  • gentoostats-cli : A command-line interface to the gentoostats webapp. You can do operations like :
    • gentoostats-cli list repo
    • gentoostats-cli list package --category=sys-devel --package=gcc
    • gentoostats-cli search --package=python --min_hosts=10
  • Better pages for the webUI, adding forms for package search etc.
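The example invocations above suggest a subcommand layout. Here is a minimal sketch of how such arguments could be parsed with argparse. This is not the actual gentoostats-cli code: the parser structure and the `target` argument name are assumptions, and only the option names come from the examples.

```python
import argparse


def build_parser():
    """Build a parser mirroring the example invocations (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="gentoostats-cli")
    actions = parser.add_subparsers(dest="action", required=True)

    # "list <target>", e.g. "list repo" or "list package --category=... --package=..."
    list_cmd = actions.add_parser("list")
    list_cmd.add_argument("target")  # assumed name for "repo", "package", ...
    list_cmd.add_argument("--category")
    list_cmd.add_argument("--package")

    # "search": filter by package name and a minimum number of reporting hosts
    search_cmd = actions.add_parser("search")
    search_cmd.add_argument("--package")
    search_cmd.add_argument("--min_hosts", type=int)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["search", "--package=python", "--min_hosts=10"]
    )
    print(args.action, args.package, args.min_hosts)
```

Parsing `--min_hosts` as an integer means a malformed value is rejected before any request would be sent to the server.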

I’m pretty much approaching the end of my project, and need some more ideas/feature suggestions. Some feedback would be helpful.

Some stuff I have in mind for the next few days:

  • Packages/Users vs Categories/Packages plots
  • Testing against various security risks
  • An ebuild for the server
Categories: gentoo, gsoc, planet-gentoo

GSoC weekly report #6

August 8, 2011 Leave a comment

This is archived from the email sent to the mailing list, since I had internet troubles at the time.

Hi everyone,

Over the past weeks, I worked on adding a JSON API to packages.gentoo.org, to access all packages in the portage tree. The patches are ready, but not yet deployed.

Currently I’m working on a cli client for the gentoostats webapp. It should basically provide all the functionality of the webapp (refer to midterm report).

Categories: gentoo, gsoc, planet-gentoo

GSoC midterm report

July 14, 2011 8 comments

Hi all,

Welcome to the midterm report of the ‘Package statistics’ project.

Summary

The goal of this project is to implement a client-server architecture for reporting and querying package statistics of Gentoo-based machines. The client program collects package statistics from Gentoo installations and submits them to a central server. The server calculates useful statistics from the global dataset, which developers as well as end users can access via an intuitive web interface.

For the past few days, I’ve been working on the webUI, adding pages for stats.
We’ve also managed to get the webapp running (finally :D ) on vulture. Thanks
to my mentor antarus, robbat2, and the rest of the infra team for helping out.
We hit a few snags, but managed to iron them out in the end. Also, apologies for
making the stupid mistake of committing my mysql password to git (:P).

What works

  • Submitting host stats using a client script
  • Accessing host stats at /host/<uuid>
  • Arch stats: /arch
  • Package stats:
    • /package/<category>
    • /package/<category>/<pkgname>
    • /package/<category>/<pkgname>-<version>
      (An optional ?top=N can be added to the url for the no. of top items)
  • Repository stats: /repo
  • Keyword stats: /keyword
  • Useflag stats:
    • /use
    • /use/<useflag>
  • Portage FEATURES stats: /feature
  • Language stats: /lang
  • Mirror stats: /mirror
  • Profile stats: /profile

What doesn’t work (yet)

  • Package search
  • Rating of packages
  • Graphs
  • Bugzilla, tinderbox integration
  • Export the stats to JSON

What needs work

  • The webUI should be prettier
  • The repository and useflag stats could be improved

I think I can finish the remaining goals in another 2-3 weeks. After that, I’ll consider working on some of my stretch goals.

I’m also working on improving the packages.gentoo.org api, so that there’s an easy way to access the portage tree state, and enrich the package stats.

Help me out by submitting your stats to the server. An ebuild for the client is available in the repo. Please report bugs, exceptions etc.

Got any feature suggestions/ideas ?

Categories: gentoo, gsoc, planet-gentoo

GSoC weekly report #5

June 30, 2011 Leave a comment

Hi everyone,

This is my fifth weekly report on the progress of my GSoC project.

Progress during the last week:

I’m getting the webapp ready to be deployed on vulture.

  • Wrote tests for the server
  • Updated the code to use HTTPS rather than HTTP
  • Deployed the webapp locally using apache and mod_wsgi, configs are attached on bugzilla

Issues:

  • I had mentioned compression as one of my goals in my last progress report. But after discussion, I’m postponing this goal until a beta server is up and running (since it could be done internally, or using mod_deflate on apache).

Goals for next week(s):

I’m going to concentrate mostly on the webUI now, adding more features, fixing up the webpages etc.

  • Improve/fix the stats pages
  • Allow user login/session management etc.
Categories: gentoo, gsoc, planet-gentoo

GSoC weekly report #4

June 23, 2011 1 comment

Hi everyone,

This is my fourth weekly report on the progress of my GSoC project. Progress was a little slow last week since I was travelling.

Progress during the last week:

  • Wrote ebuild for the client
  • Auth info for the host is read from a config file now
  • Implemented a config file feature for the user to mask reported fields
  • Worked on pages for per-package and per-arch stats
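As a rough illustration of the field-masking idea, here is a sketch that reads a mask from an INI-style config and drops the masked fields from a payload before submission. The `[MASK]` section, the option names and the payload keys are all hypothetical, and the real client's config format may differ.

```python
import configparser

# Hypothetical config layout: section and option names are illustrative only.
SAMPLE_CONFIG = """
[MASK]
useflags = yes
mirrors = yes
lang = no
"""


def masked_fields(config_text):
    """Return the set of field names the user has chosen not to report."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("MASK"):
        return set()
    section = parser["MASK"]
    return {name for name in section if section.getboolean(name)}


def apply_mask(payload, masked):
    """Drop masked fields from the payload before it is submitted."""
    return {key: value for key, value in payload.items() if key not in masked}


if __name__ == "__main__":
    payload = {"useflags": ["X", "alsa"], "lang": "en_US", "arch": "amd64"}
    print(apply_mask(payload, masked_fields(SAMPLE_CONFIG)))
```

Filtering on the client side means masked fields never leave the machine, rather than being discarded by the server.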

Issues:

  • Payload compression : The client data sent could be compressed to improve post time. This could be done by gzip compression of the payload (authentication info should be separated from the payload then), or by using a transparent gzip reverse proxy with apache.
  • HTTPS : It was suggested to send the data over HTTPS for better security. This too, could be implemented using reverse proxies.
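The first option under payload compression could look roughly like the sketch below: the stats are serialised to JSON and gzip-compressed as the request body, while the authentication info travels separately in headers. The header names and the shape of the `auth` dictionary are assumptions made for illustration. The client did not do this at the time of writing.

```python
import gzip
import json


def build_request(auth, stats):
    """Return (headers, body): auth in headers, gzip-compressed JSON stats as body."""
    body = gzip.compress(json.dumps(stats).encode("utf-8"))
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        # Hypothetical header names, kept outside the compressed payload.
        "X-Auth-UUID": auth["uuid"],
        "X-Auth-Password": auth["password"],
    }
    return headers, body


def read_body(body):
    """Server-side counterpart: decompress and decode the stats payload."""
    return json.loads(gzip.decompress(body).decode("utf-8"))


if __name__ == "__main__":
    stats = {"packages": ["sys-devel/gcc-4.5.2"] * 500}
    headers, body = build_request({"uuid": "example", "password": "secret"}, stats)
    print(len(json.dumps(stats)), "bytes raw,", len(body), "bytes compressed")
```

The reverse proxy alternative mentioned in the same bullet would leave the client untouched and handle compression in apache instead.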

Blockers:

  • Still blocked on bug 369679 to deploy the webapp

Goals for next week(s):

  • Work on the above issues
  • Continue work on the WebUI
  • Write some tests for the client
Categories: gentoo, gsoc, planet-gentoo

GSoC weekly report #3

June 15, 2011 Leave a comment

Hi everyone,

This is my third weekly report on the progress of my GSoC project.

Progress during the last week:

  • Created mockups for the WebUI
  • Reworked the server code to be modular
  • Wrote a page to display per host statistics
  • Wrote a setup.py script to install the client
  • Deployed the webapp locally using mod_wsgi

Issues:

Goals for next week(s):

  • Write ebuilds for the client/server
  • Continue writing the stats pages based on the mockups
  • Write some tests for the client
Categories: gentoo, gsoc, planet-gentoo