[p2pu-dev] Course metrics

Jos Flores josmasflores at gmail.com
Tue Jun 19 17:46:45 UTC 2012


Hey Dirk,

I don't think getting data up to 'yesterday' is an issue. Overhead on
page views sounds really ugly.

Would it be very difficult to put all the graphics in a PDF and send
both the PDF and the CSV to the organiser? That, plus a lock so that it
does not duplicate data, would make me happy: I wouldn't have to wait,
I'd just say I want them and, when they are ready, they would be in my
inbox.
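
A minimal sketch of the "lock so that it does not duplicate data" idea,
assuming a single process. ReportLock, request_report, build_report and
deliver are hypothetical names, not anything in lernanta, and a real
deployment would need the lock to live somewhere shared (the DB or a
cache) rather than in memory:

```python
import threading


class ReportLock:
    """In-process stand-in for a shared per-course lock (assumption:
    a real one would be stored in the database or a cache)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._running = set()

    def acquire(self, course_id):
        # Return False if a report for this course is already in progress.
        with self._guard:
            if course_id in self._running:
                return False
            self._running.add(course_id)
            return True

    def release(self, course_id):
        with self._guard:
            self._running.discard(course_id)


def request_report(lock, course_id, build_report, deliver):
    """Build and deliver a report unless one is already being built."""
    if not lock.acquire(course_id):
        return False  # duplicate request (e.g. a page refresh): do nothing
    try:
        deliver(build_report(course_id))
    finally:
        lock.release(course_id)  # always release, even if the build fails
    return True
```

Here deliver would be whatever sends the PDF and CSV by email; a second
request made while the first is still running returns False instead of
starting the work again.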

Is the process CPU intensive, or is the DB slowing things down? Could
this be part of the API and run on a different machine? This is a bit
out there for now, but it sounds like a perfect job for a Node.js stream
and a document DB, if it's not CPU bound.
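
One rough way to answer the CPU-versus-DB question is to compare
wall-clock time with CPU time for the update. This is only a sketch:
profile is a hypothetical helper, and it assumes the database runs in a
separate process, so time spent waiting on it shows up as wall time but
not as CPU time:

```python
import time


def profile(fn, *args):
    """Run fn and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time used by this process only
    result = fn(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu
```

If cpu is close to wall, the work is CPU bound; if wall is much larger
than cpu, most of the time is spent waiting, most likely on the DB.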

cheers,
José


On 19 June 2012 17:07, Dirk Uys <dirk at p2pu.org> wrote:
> Hi everyone
>
> During the last release on 18 May I enabled course metrics for all course
> organizers, believing that the metrics were working perfectly and that it
> was simply a permission update. You know what they say about assumptions...
>
> The problem is that when a user goes to the metric page for a course the
> metrics get generated from the recorded page views
> (https://github.com/p2pu/lernanta/blob/master/lernanta/apps/tracker/models.py#L100).
> If the user refreshes the page (because it's taking so long), the process is
> started again and the metric updating procedure happens concurrently. This
> doesn't play nice with the intended use of the db and duplicated data is
> generated :(
>
> Now, solving this problem has multiple possibilities! Each with pros and
> cons.
>
> 1. Enforce some locking mechanism to ensure the operation only happens once
> + process doesn't run concurrently
> - user waits
> - lots of db work tied to specific requests
>
> 2. Queue a celery task that runs the operation
> + user doesn't need to wait for results
> - still need to implement some locking mechanism to prevent celery tasks
> from running concurrently
> - lots of db work tied to specific requests
>
> 3. Keep the table updated from the get go
> + metrics are always up to date
> - introduces small overhead to every page view
> - generates metrics that are never used
>
> 4. Fix the data duplication issue that presents itself
> + doesn't matter if process runs concurrently
> - update still takes a long time
> - lots of db work tied to specific requests
>
> 5. Don't trigger the update process based on user actions, but rather at a
> predetermined time
> + user doesn't wait
> - generates metrics that are never used
>
> 6. ?
>
> Does anyone have any thoughts on this?
>
> Cheers
> d
>
> _______________________________________________
> p2pu-dev mailing list
> p2pu-dev at lists.p2pu.org
> http://lists.p2pu.org/mailman/listinfo/p2pu-dev
>
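
Option 4 in the quoted message (making the update safe to run more than
once) can be sketched as follows. This is an illustration, not the
tracker code: aggregate, update_metrics and the "course"/"day" fields
are assumptions, and a plain dict stands in for the metrics table. The
point is that each row is overwritten with a freshly computed count
rather than incremented, so a repeated run leaves the same data:

```python
from collections import Counter


def aggregate(page_views):
    """Count views per (course, day) from raw page-view records."""
    return Counter((view["course"], view["day"]) for view in page_views)


def update_metrics(metrics, page_views):
    """Overwrite (never increment) each key, so that running the update
    twice over the same page views yields the same table."""
    for key, count in aggregate(page_views).items():
        metrics[key] = count
    return metrics
```

In a real database the same effect would need a unique key on
(course, day) and an update-or-insert, which is also an assumption
about the schema.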

