
23 August 2004

My Data I Wanna

This blog is powered by COREBlog, a Zope product. Blog entries are persistent Python objects stored in Zope's ZODB.

Let's say I wish to keep a copy of each entry in a separate file in the local filesystem.

One way is to fetch it through the web:

$ curl -o 1.html http://sandbox.rulemaker.net/ngps/1

This saves blog entry '1' into the file 1.html.

To get the lot, with $last set to the highest entry number:

$ for ((i=1; i<=$last; i++))
do
curl -o $i.html http://sandbox.rulemaker.net/ngps/$i
done
$

Each HTML file contains the fully rendered page: the entry together with headers, footers and the Google ad. Getting just the entry body takes more work, such as parsing the HTML, or crafting a DTML method or Python script within Zope that displays only the body.
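As a rough sketch of the HTML-parsing route, the following pulls the text out of a single div using only the standard library. It is written for a current Python 3 rather than the Python 2.1 used later in this entry, and the class name "entrybody" is purely a guess at how the rendered page marks up the entry; check the actual page source and adjust.

```python
from html.parser import HTMLParser


class EntryBodyExtractor(HTMLParser):
    """Collect the text found inside one <div class="..."> element."""

    def __init__(self, css_class="entrybody"):  # hypothetical class name
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # 0 means we are outside the target div
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1     # a nested div inside the target
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1      # entering the target div

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_body(html, css_class="entrybody"):
    """Return the text inside the first matching div, without markup."""
    parser = EntryBodyExtractor(css_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()
```

Calling extract_body(open("1.html").read()) would then return the entry text alone, provided the guess about the markup holds.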

Another way is to read it from (a copy of) the ZODB. On my desktop, I have a local Zope installed into ~/pkg/zope. First I get my blog entries into this Zope, either by importing from the live Zope, or by copying over the entire ZODB file itself.

Then, invoke Python:

$ python2.1
Python 2.1.3 (#2, Jun 10 2003, 00:24:34) 
[GCC 2.95.4 20020320 [FreeBSD]] on freebsd4
Type "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path.insert(0,'/home/ngps/pkg/zope/lib/python')
>>> import Zope
>>> root = Zope.app()

Now 'root' points to Zope's root folder. Traverse to the blog:

>>> coreblog = root['sandbox.rulemaker.net']['ngps']['blog']
>>> coreblog
<COREBlog instance at 91c7880>
>>> e = coreblog.entries[1]
>>> e
<Entry instance at 91cd280>
>>> e.title
'Winds Of Change'
>>> e.body   
"So, I have decided this web site needs some reworking.\n\n..."
>>> e.created
1078502400
>>> import time
>>> time.ctime(e.created)
'Sat Mar  6 00:00:00 2004'

I have the title, the body, the body's format - either plain text, StructuredText or HTML - and the timestamp. Looks like the lot. From here it is just a little more work to package the above into a program that extracts all the entries, comments and trackbacks.
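A minimal sketch of that packaging step, for entries only, could look like this. It assumes coreblog.entries behaves like a mapping with a keys() method (the session above only shows indexing by number), and it uses just the three attributes seen there: title, body and created. The function names and the .txt layout are my own choices for illustration, and the code is written for a current Python 3, so it would need adapting to run under Python 2.1. Comments and trackbacks are left out because their attribute names are not shown above.

```python
import os
import time


def dump_entry(entry, entry_id, outdir):
    """Write one entry's title, timestamp and body to <outdir>/<entry_id>.txt."""
    path = os.path.join(outdir, "%s.txt" % entry_id)
    with open(path, "w", encoding="utf-8") as f:
        f.write("Title: %s\n" % entry.title)
        f.write("Created: %s\n\n" % time.ctime(entry.created))
        f.write(entry.body)
    return path


def dump_entries(entries, outdir):
    """Dump every entry in a mapping of entry id -> entry object.

    Each entry object only needs title, body and created attributes.
    Returns the list of files written, in entry id order.
    """
    os.makedirs(outdir, exist_ok=True)
    return [dump_entry(entries[i], i, outdir) for i in sorted(entries.keys())]
```

With the session above still open, dump_entries(coreblog.entries, 'entries') would be the intended call, if the mapping assumption is right.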

The above cannot be done while that particular ZODB is in use by a running Zope. If my live, Internet-facing Zope is in a Zope Enterprise Objects (ZEO) setup, then I can do the same against the live data by connecting to it as a ZEO client. This is similar to having REPL access to a Common Lisp web server.
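For the ZEO case, connecting as a client might look roughly like the sketch below. The host and port are placeholders, not my real setup, the function name is made up for illustration, and the details vary between ZODB/ZEO releases, so treat it as an outline rather than a recipe.

```python
def open_live_root(host="localhost", port=8100):
    """Connect to a running ZEO server and return (db, Zope root folder).

    host and port are placeholder values; use the address the ZEO
    server actually listens on.
    """
    # Imported lazily so this module still loads where ZEO is not installed.
    from ZEO.ClientStorage import ClientStorage
    from ZODB import DB

    storage = ClientStorage((host, port))
    db = DB(storage)
    connection = db.open()
    # Zope 2 keeps its root folder under the 'Application' key of the ZODB root.
    return db, connection.root()["Application"]
```

The returned root folder can then be traversed just like 'root' in the session above, and db.close() disconnects from the server when done.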


Posted by ngps at 16:55 | Comments (4) | Trackbacks (0)
Comments
Re: My Data I Wanna

Excellent!

Coreblog smells a lot like Squishdot..wonder how well this translated.
Here's a way I dumped Squishdot entries to a Zwiki page w/ dtml:
http://nomad.freezope.org/weblog/1023631764/
which could then be curled.

/surprised you're not listed on the coreblog site

Posted by: DeanG at August 23,2004 23:49
OT: howto.ca

hey,
sorry to bother you, but i was wondering whether your quite helpful howto.ca.html document is still accessible online? i was able to grab a cached version from google, but for future references i would prefer to use the original [your] site. is it still online? thnx, -folkert.

Posted by: folkert at August 25,2004 19:04
Re: My Data I Wanna

http://sandbox.rulemaker.net/ngps/m2/howto.ca.html

Posted by: Ng Pheng Siong at August 26,2004 00:03
thnx a lot.

nice calligraphics, btw :)

Posted by: folkert at August 26,2004 00:44