Archive channel tree command [DRAFT] #2654
Codecov Report
@@ Coverage Diff @@
## master #2654 +/- ##
=======================================
Coverage 85.39% 85.39%
=======================================
Files 298 298
Lines 15767 15767
=======================================
Hits 13465 13465
Misses 2302 2302
What's needed to help push this forward, @ivanistheone?
These are the minimal additions needed to make sure the JSON archive format really works with the treediffer preset="studio" defined in https://github.com/learningequality/treediffer/blob/master/src/treediffer/presets.py#L39-L80
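For readers unfamiliar with detailed diffs, here is a rough sketch of the general kind of comparison that can be made between two archived trees. It is not treediffer's API or algorithm; the function names, the dict layout, and the `node_id`/`title`/`children` keys are assumptions made for illustration only.

```python
def index_by_id(node, parent_id=None, index=None):
    """Flatten a tree into {node_id: (parent_id, attributes without children)}."""
    if index is None:
        index = {}
    attrs = {key: value for key, value in node.items() if key != "children"}
    index[node["node_id"]] = (parent_id, attrs)
    for child in node.get("children", []):
        index_by_id(child, node["node_id"], index)
    return index


def diff_trees(old_root, new_root):
    """Compare two archived trees by node_id (hypothetical, simplified diff)."""
    old, new = index_by_id(old_root), index_by_id(new_root)
    common = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        # Same node_id but a different parent counts as a move.
        "moved": sorted(i for i in common if old[i][0] != new[i][0]),
        # Same node_id but different attributes counts as a modification.
        "modified": sorted(i for i in common if old[i][1] != new[i][1]),
    }


# Two hypothetical archives of the same channel at different times.
old_tree = {"node_id": "root", "title": "Channel", "children": [
    {"node_id": "a", "title": "A"},
    {"node_id": "t1", "title": "Unit 1", "children": [{"node_id": "b", "title": "B"}]},
    {"node_id": "c", "title": "C"},
]}
new_tree = {"node_id": "root", "title": "Channel", "children": [
    {"node_id": "a", "title": "A (edited)"},
    {"node_id": "t1", "title": "Unit 1", "children": []},
    {"node_id": "b", "title": "B"},
    {"node_id": "d", "title": "D"},
]}

result = diff_trees(old_tree, new_tree)
```

Keying everything on a stable node identifier is what makes moves detectable; a real preset would also need to decide which attributes count as a modification.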
7fefb6c to a3f3cdb
For context, this PR came out of a misunderstanding on my part. When I heard Jordan was working on channel diff, I rushed to get the archive channel command and the associated detailed diff code ready so she could use it, but then I realized "channel diff" meant just the simpler "channel counts diff" and detailed diff wasn't in scope, hence the pause on it. That being said, it would be good to start archiving channel data, even if there is no frontend for it yet. @rtibbles Here is a mini-list of possible next steps:
Other related dev work:
I'm a bit out of the loop so I cannot speak to priority/timeframes, but I'm happy to help out in my free time on B. after A. (confirm this mgmt command is needed).
Use cases
These were discussed a bit with Jordan and @kollivier as useful, but I'm not sure if/when they would fit in the roadmap:

1/ channeldiff task + command
See standalone POC command-line code for this here: treediffer/examples/studiodiffferpoc.py

2/ channeldiff UI
run channeldiff task, then

3/ archival
Not sure if we need to tackle this right now, since it requires thinking about scalability + long-term user data retention. It would be nice to have a combined command archivechannel that does both archivechanneltree and archivechanneldb.

4/ PUBLISH/EXPORT Kolibri DB from studio JSON archive tree
Instead of export.py being based on direct access to the DB, Kolibri-DB creation can be an independent task with input studio_tree_archive.json --> Kolibri DB (plus fetching perseus files if needed).

5/ content provenance
All the expensive "graph analytics", such as working out which channels a given channel imports from, can be done easily based on the channel archive JSON.

6/ ROC data importer
Not needed for the ROC prototype, but good to have full Studio data (including provenance).
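To illustrate the provenance use case, here is a minimal sketch of building a "which channel imports from which" graph purely from archive files, with no DB access. The `source_channel_id` key, the archive layout, and all function names are assumptions for illustration; the actual archive format may differ.

```python
from collections import defaultdict


def iter_nodes(node):
    """Yield a node and all of its descendants, depth first."""
    yield node
    for child in node.get("children", []):
        yield from iter_nodes(child)


def imports_by_channel(archives):
    """Map each channel id to the set of other channels it imports nodes from."""
    imports = defaultdict(set)
    for channel_id, root in archives.items():
        for node in iter_nodes(root):
            source = node.get("source_channel_id")  # assumed field name
            if source and source != channel_id:
                imports[channel_id].add(source)
    return dict(imports)


# Hypothetical archives, keyed by channel id, as if loaded from JSON files.
archives = {
    "chan_a": {"node_id": "r1", "children": [
        {"node_id": "n1", "source_channel_id": "chan_a"},
        {"node_id": "n2", "source_channel_id": "chan_b"},
    ]},
    "chan_b": {"node_id": "r2", "children": [
        {"node_id": "n3", "source_channel_id": "chan_b"},
    ]},
    "chan_c": {"node_id": "r3", "children": [
        {"node_id": "t", "children": [
            {"node_id": "n4", "source_channel_id": "chan_a"},
            {"node_id": "n5", "source_channel_id": "chan_b"},
        ]},
    ]},
}

import_graph = imports_by_channel(archives)
```

Because the traversal only reads dicts, this kind of analysis could run offline over a folder of archives without putting load on the Studio DB.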

Description
This is a POC for a "channel archiving" command that exports the complete channel tree as JSON.
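As a rough illustration of the idea (not the code in this PR), the sketch below walks a tree recursively and writes out only an explicit set of fields per node. It uses plain dicts in place of Studio models, and the field names and the `archive_tree` helper are hypothetical.

```python
import json


def archive_tree(node):
    """Recursively convert a node (plain dict) into a JSON-serializable dict."""
    # Only explicitly listed fields are archived; anything else is dropped.
    archived = {
        "node_id": node["node_id"],
        "title": node["title"],
        "kind": node["kind"],
    }
    children = node.get("children", [])
    if children:
        # The list comprehension keeps siblings in their original order.
        archived["children"] = [archive_tree(child) for child in children]
    return archived


# Hypothetical in-memory channel tree standing in for rows from the Studio DB.
channel_tree = {
    "node_id": "root",
    "title": "Sample channel",
    "kind": "topic",
    "internal_cache": "not archived",
    "children": [
        {"node_id": "a1", "title": "Intro video", "kind": "video"},
        {
            "node_id": "t1",
            "title": "Unit 1",
            "kind": "topic",
            "children": [
                {"node_id": "b1", "title": "Worksheet", "kind": "document"},
            ],
        },
    ],
}

archive = archive_tree(channel_tree)
archive_json = json.dumps(archive, indent=2, sort_keys=True)
```

Listing the archived fields explicitly keeps the output stable and reviewable, at the cost of having to update the list when the models change.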
Steps to Test
Run ./contentcuration/manage.py archivechanneltree {channel_id} for a {channel_id} that exists in the local DB.
Implementation Notes
At a high level, how did you implement this?
The function archive_channel_tree(channel_id, tree='main') in contentcuration/contentcuration/utils/archive.py
The management command archivechanneltree that calls this function.
Does this introduce any tech-debt items?
Since we're using a new serializer for this task, the fields of that serializer would have to be kept up to date as the Studio data models evolve.
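One possible way to keep this debt visible is a small check that fails when a model gains a field the archive does not cover. The sketch below uses a dataclass as a stand-in for a Studio model; `NodeModel`, `ARCHIVED_FIELDS`, `EXCLUDED_FIELDS`, and `unarchived_fields` are all hypothetical names, not part of this PR.

```python
from dataclasses import dataclass, fields


@dataclass
class NodeModel:
    """Hypothetical stand-in for a Studio data model."""
    node_id: str
    title: str
    kind: str
    language: str = ""  # imagine this field was added to the model later


# Hypothetical set of fields the archive serializer writes out.
ARCHIVED_FIELDS = {"node_id", "title", "kind"}

# Model fields deliberately left out of the archive (none in this sketch).
EXCLUDED_FIELDS = set()


def unarchived_fields(model_cls, archived, excluded):
    """Return model fields that are neither archived nor explicitly excluded."""
    model_fields = {f.name for f in fields(model_cls)}
    return sorted(model_fields - archived - excluded)


# A non-empty result signals that the serializer has drifted from the model.
missing = unarchived_fields(NodeModel, ARCHIVED_FIELDS, EXCLUDED_FIELDS)
```

In a Django project the model field names could presumably be read from `Model._meta.get_fields()` instead of `dataclasses.fields`, and the check could run as a unit test.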
Checklist
Comments
This is strictly a POC and not finished; it would need to be continued in order to make sure channel archives contain all the info needed for all possible use cases (e.g. is the info enough to "restore" a channel from an archive?).
Reviewers
exportchannel command) from the need to access studio DB (assuming all the necessary info is present in the archived