The short-form week of 19–25 Nov 2012
26 Nov 2012
The week in review, 140 characters at a time.
This document was created automatically from my archive of my Twitter stream. Due to limitations in the Twitter API and occasional glitches in my archiving system, it may not be 100% complete.
In a conversation that started on Tuesday at 09:53am
Programming is a race between programmers building better idiot-proof programs and
the Universe producing better idiots http://t.co/xIy0LMXU—@OlivierCroisier
@OlivierCroisier smart money is on the universe.—@ndw
Monday at 09:29am
There is no video I can recommend more today https://t.co/i8MyCLqO—@shelleypowers
Monday at 10:13am
Starting to encounter apps that require a more recent OS X than 10.6. Shame about
that.—@ndw
Monday at 10:32am
“Slow down when road is wet” #absurdtrafficsigns—@ndw
Monday at 11:06am
If you make me change my password then mail my password to me in plain text, you are
in need of a clue.—@ndw
In a conversation that started on Monday at 12:25pm
Why shopping makes introverts psycho. http://t.co/j1hnhvHK—@trieloff
Monday at 12:54pm
If there's one thing we don't need, it's another US politician who talks about "disputes
among theologians" over scientific questions.—@kendall
In a conversation that started on Monday at 05:09pm
OS X “Mountain Lion” FileVault 2 “Whole Disk Encryption”. What sayest thou?—@ndw
@ndw works for me—@adamretter
.@ndw There's a reason it's nicknamed "Vile Fault"...—@nik_clayton
@nik_clayton The alternative is to fork over $114 and upgrade my Symantec PGP WDE license to 10.2.
But most reports of FileVault have been +—@ndw
Monday at 05:42pm
Reflecting that we cannot escape time. "This thought is as a death, which cannot choose;
But weep to have that which it fears to lose."—@johnlsheridan
Monday at 05:43pm
The quote in the last tweet was from Shakespeare, by the way.—@johnlsheridan
In a conversation that started on Monday at 07:16pm
Cardinal Dolan really should burn in hell: http://t.co/L05X5UT6—@kendall
Monday at 07:33pm
In the future, the only way to prove you're not a cybernetic organism will be to violate
Asimov's first law.—@natesbrain
In a conversation that started on Monday at 08:21pm
I’m not sure why I get spam for apartment rentals in Ulaanbaatar, but it is without
question the most tempting spam I ever receive.—@ndw
Monday at 08:43pm
The power of wetware: only about 1 spam message in 1,000 has to be opened; 999 can
be deduced from the subject alone.—@ndw
Tuesday at 05:41am
There can never be a bacon overflow error. There can only be an out of bacon error.—@CthonicSoftware
In a conversation that started on Tuesday at 08:51am
Is there an XQuery DB that can parse & store HTML using proper HTML5 parsing? I see
Zorba has something, but it's Tidy, not real parsing.—@robinberjon
@robinberjon we also just have tidy. What/why you trying to do?—@adamretter
@adamretter But I need it properly parsed as HTML. Tidy will change the syntax and parse as XML,
so the Infoset will be different.—@robinberjon
@robinberjon I'm a little ignorant of html5 parsing, can you provide a simple html5 example where
an xpath result would be wrong after tidy?—@adamretter
@adamretter E.g. namespace declarations shouldn't be namespace declarations. See tweetthread
with @ndw—@robinberjon
@robinberjon @ndw I could parse with nuvalidator html5 library and then store in eXists persistent
DOM easily enough. Would give you xquery—@adamretter
@adamretter Aha, that's a lead. But as @ndw indicates, would the XQ then accept the likes of fn:id("42")?—@robinberjon
@robinberjon @adamretter Yes, probably. XQuery is reasonably relaxed about ID values, actually. But “/fb:foo”
is still going to be a QName.—@ndw
@ndw @robinberjon I rather wondered if the html5 parser would drop those pesky xmlns:fb decls—@adamretter
@ndw That's fine so long as you can fn:QName("", "xmlns:fb") and such — you can always
compare with /*[…] instead @adamretter—@robinberjon
@ndw In other words, having to be careful with some queries isn't a problem so long as
it doesn't barf on weird constructs. @adamretter—@robinberjon
@robinberjon It’ll barf on some, like xs:QName(“”,”foo:bar”), but you can probably work around
most of them. Example of interesting query?—@ndw
@ndw That might trip it or in general?—@robinberjon
@robinberjon In general. What’s an example of a query you’d like to run efficiently on the corpus.—@ndw
@ndw I'm trying to decide if alt text is used as replacement (as claimed in HTML5) or
as a meta description (as claimed by WCAG).—@robinberjon
@robinberjon @ndw Good luck, this seems as easy decision as solving NP-complete problem :-)—@jirkakosek
@jirkakosek LOL :) I'm just generating the report — for the NLP parts I'm using a human being.
@ndw—@robinberjon
@ndw Here's the relevant snippet from the JS I'm using https://t.co/vPScgPoX—@robinberjon
@ndw A simple example from right now: show all img alt text highlighted in the context
of surrounding text (if any).—@robinberjon
@robinberjon Ok.—@ndw
@ndw Where surrounding text is any text before or after the img inside a block-level parent.—@robinberjon
@robinberjon XML Calabash uses Henri’s parser; you could go from HTML to XML that way and then
use, uh, for example, http://t.co/fArHsI26—@ndw
@ndw i.e. I need to query the real Infoset one gets from HTML parsing; that doesn't always
round-trip through XHTML.—@robinberjon
@ndw Would it be able to store to MarkLogic without serialising to XHTML in the middle?
Because that won't work, sadly.—@robinberjon
@robinberjon It’ll be serialized as XML in the middle, not particularly XHTML. What are you afraid
will get lost, exactly?—@ndw
@ndw E.g. I see a bunch of xmlns:fb in the corpus from people who use Facebook's scripts.
That must not be parsed as an NS declaration.—@robinberjon
@robinberjon Uhm, I think you’re SOL. XQuery operates on instances of the XPath Data Model. That
HTML is broken wrt namespaces is a shame.—@ndw
@ndw Ah, indeed, I'm not ever going to get local names of xmlns:fb am I. *sigh* Someone
should make a tool that properly handles HTML *hint*—@robinberjon
@robinberjon Or HTML could…no, @al3xbrown is right. I’m going to shut up too. Or start cursing.—@ndw
@ndw I'm afraid that ship has sailed a long, long time ago :) I'm just in the trenches
trying to marry good data to good tools! @al3xbrown—@robinberjon
@al3xbrown Haha, I doubt all the power in the world could update that much legacy content :)
@ndw—@robinberjon
@ndw For a subset of queries it would be good though — I just need to mull over whether
it's large enough.—@robinberjon
@robinberjon If you point me to the corpus, I’ll see what I can do over the weekend.—@ndw
@ndw I'm operating on http://t.co/XTcsvr55 I reckon I might be able to use ML with just the NS limitation. Pondering that. Thoughts
welcome!—@robinberjon
@ndw Oh, btw, it's a real world corpus of the top 10K sites so be warned that not all
of it is, erm, necessarily tasteful.—@robinberjon
@robinberjon *snort* I’ll try not to blush.—@ndw
@ndw I *wonder* if there's a business in indexing HTML documents (properly) with powerful
querying abilities…—@robinberjon
@robinberjon *shrug* You’d need a query language for it and such. Seems unlikely to me, but my
crystal ball is uniformly opaque.—@ndw
@ndw @robinberjon Guys, you still have 10 days to draft necessary changes and submit it for #xmlprague. Then 2 months for implementation :-)—@jirkakosek
@ndw XQuery could work just fine with the same relaxations applied I reckon.—@robinberjon
@robinberjon how much are you prepared to pay for queries over serious indexes of HTML documents?
cc @ndw—@chaals
@chaals I don't know, I've always sucked at setting prices. I'd use it on and off but regularly.
Yesterday it cost me 5 hours. @ndw—@robinberjon
@ndw So I'm thinking there's no reason there couldn't be a "loose" XDM mode that could
handle HTML as such.—@robinberjon
@ndw Thinking about it, I reckon most/all of the XDM checks that make processing HTML
impossible aren't needed for optimisation, right?—@robinberjon
@robinberjon Seems likely to me. The only problem will be queries that rely on the broken^H^H^H^H^H^Hnon-XML
nature of some HTML documents.—@ndw
@robinberjon @ndw okay thanks Robin - I don't think I can tweet anything constructive wrt this, so
I'll shut up now :-)—@al3xbrown
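The namespace point in the thread above can be illustrated with a small sketch. It assumes Python's standard library only: `html.parser` stands in for an HTML parser (it is a lenient tokenizer, not a full HTML5 tree builder) and `xml.etree.ElementTree` stands in for an XML parser. The sample markup, the namespace URI, and the helper names `html_view` and `xml_view` are hypothetical and not taken from the corpus discussed.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample; the namespace URI is a placeholder.
SAMPLE = (
    '<html xmlns:fb="http://example.com/fb">'
    '<body><fb:like href="x"></fb:like></body></html>'
)


class Collector(HTMLParser):
    """Record start-tag names and the attributes seen on <html>."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.root_attrs = {}

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "html":
            self.root_attrs = dict(attrs)


def html_view(markup):
    """Tag names and root attributes as an HTML-style tokenizer reports them."""
    collector = Collector()
    collector.feed(markup)
    collector.close()
    return collector.tags, collector.root_attrs


def xml_view(markup):
    """Tag names and root attributes as a namespace-aware XML parser reports them."""
    root = ET.fromstring(markup)
    return [el.tag for el in root.iter()], dict(root.attrib)
```

On this sample, `html_view(SAMPLE)` reports `xmlns:fb` as an ordinary attribute and `fb:like` as a plain tag name, while `xml_view(SAMPLE)` consumes the declaration and expands the element name to `{http://example.com/fb}like`. That difference is the one that serialising through XML in the middle would introduce.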
Tuesday at 09:01am
Perform a google maps search on my laptop. Next time I look at my phone, it’s showing
me an ETA to there. That’s cool. Creepy too, but cool.—@ndw
Tuesday at 09:02am
Oh, look, the App Store has put up a dialog that won’t go away. App Store, how I loathe
you in concept and execution.—@ndw
In a conversation that started on Tuesday at 09:39am
.@robinberjon @ndw "[HTML 5 infoset] doesn't always round-trip through XHTML" <-- what?!—@al3xbrown
@al3xbrown There will be small but real differences with script content and xmlns attributes
(which I do see in the corpus). @ndw—@robinberjon
In a conversation that started on Tuesday at 12:23pm
@ndw @robinberjon This is late and probably ill informed, but why not preprocess the HTML as XML and
XSL all the fb:* to fb-*?—@fidothe
@fidothe That was just one example, there are many things that simply don't map very well
to XML at all. @ndw—@robinberjon
Tuesday at 01:39pm
WOW. Who'd have thought a 470 year old institution founded solely to facilitate a
King's philandering would be so backward and misogynist?—@MitchBenn
In a conversation that started on Tuesday at 03:47pm
Well said RT @arosenbaum: @dscape @MarkLogic CJL! Who cares about the title. Lack of technical leadership has never been MarkLogic's
problem—@JoshNarva
@JoshNarva @arosenbaum i disagree — if the technology is so awesome then how come no one uses it? cause
of wrong technology choices 4 devs—@dscape
Tuesday at 05:30pm
Past, Present and Future walked into a bar. It was tense.—@BadJokeCat
In a conversation that started on Tuesday at 06:01pm
Christian Taliban in the House! http://t.co/RdvWjrho—@kendall
Tuesday at 06:28pm
Blah blah Jesus blah blah talking snakes blah teach the controversy blah why can't
Johnny do math or get a good job?—@kendall
In a conversation that started on Tuesday at 08:51pm
Update: my office smells like fries and discarded sandwich bits—@JoshNarva
In a conversation that started on Tuesday at 11:04pm
It's funny when people think nothing can have value unless there is a god. That must
be the dumbest thing ever said.—@kendall
Wednesday at 12:34am
Ok. Reserva de la Familia is just damned fine. http://t.co/Cj9TcLJ8—@ndw
Wednesday at 03:13am
The official twitter hashtag for Susan Boyle’s new release is #susanalbumparty Her social media team are either very naive or geniuses.—@pjstead
In a conversation that started on Wednesday at 04:00am
If you've tried writing (code, articles) on one of the 7" tablets I'd be interested
in hearing your experience.—@robinberjon
@robinberjon It’s significantly better than a 4” phone. With a real keyboard, I think it’d be
quite practical for prose; not sure about code—@ndw
@ndw Ah, thanks for the input — you're the first positive feedback so far :)—@robinberjon
In a conversation that started on Wednesday at 04:37am
*sigh* KeepNote is nice, but has too many issues and missing features for me; major
(for me): no support for numbered lists—@marjoleink
@marjoleink Not sure exactly what you’re looking for but Emacs org-mode is awfully good.—@ndw
In a conversation that started on Wednesday at 09:12am
Hmmm. A petition to get the FCC to revoke Murdoch’s broadcast license for Fox News.
I don’t know about that.—@ndw
Wednesday at 02:25pm
Ugh. Importing my calendar in 10.8 has introduced a whole raft of appointments from
some other random calendar that I can’t delete.—@ndw
Wednesday at 04:38pm
Oh, look. Importing a calendar exported from iCal 10.6 causes Calendar 10.8 to crash.
Lovely.—@ndw
In a conversation that started on Wednesday at 08:02pm
OMFG! The skeuomorphic calendar and contacts are <maximum-emphasis>awful</maximum-emphasis>!—@ndw
In a conversation that started on Wednesday at 08:37pm
No second external monitor for me. DisplayLink isn’t supporting 10.8 yet. :-/—@ndw
In a conversation that started on Wednesday at 09:27pm
I live in the south now (oh, shut up). That means Pecan Pie instead of Pumpkin. Heathens.
And yet, I claim #win http://t.co/xAIYipzm—@ndw
@ndw Welcome to the South! Most Southern cooks can prepare "foreign" dishes. Just have
to ask. ;-)—@patrickDurusau
@ndw The north officially approves of Pecan, too. Oh wait. His grandma was southern...
But it's PIE! :) HTG, Norm.—@JeanKaplansky
Thursday at 01:04am
Knocking on wood & getting no reply. http://t.co/W87Q1QX2—@ndw
Thursday at 02:44am
"@ndw: I live in the south now. That means Pecan Pie instead of Pumpkin. Heathens. http://t.co/MWFVGw6y" corrctn: means pumpkin AND pecan.—@aljopainter
Thursday at 02:47am
@ndw Furthermore, people in Texas don't say "I live in the South." They out 'n say it:
"I'm from Texas!"—@aljopainter
Thursday at 03:08am
@ndw Old saw: "I never ask a fella where he's from. If he's from Texas, he'll out 'n tell
me. If he ain't, well, don't wanna embarrass him."—@aljopainter
Friday at 04:24pm
#FF recent @EricIdle @ljmullinsworld @pritheworld @sullydish @Green_Footballs @ndw @bhobservatory @bfdradio @webmink @Mass_Inc_Paging—@n1vux
In a conversation that started on Friday at 08:10pm
$DEITY protect me, I will now attempt to set up syncing between my OS X 10.8 Contacts
and Calendar and Google contacts and calendar.—@ndw
Friday at 09:00pm
RT @n1vux RT @ndw: OMFG! @stephenfry builds a medieval printing press. http://t.co/NaKYHlJo #squee—@Squ_ee
Friday at 09:23pm
First level approximation of the status of syncing my contacts and calendar data with
Google: I’m fucked.—@ndw
In a conversation that started on Friday at 11:40pm
“Bet you $50 it hasn’t.” http://t.co/Hmy5GNIN—@ndw
In a conversation that started on Saturday at 12:00pm
To the woman with the "biohazard" symbol tattoo visible in her cleavage: what!?—@ndw
@ndw Women are indeed dangerous?—@bortzmeyer
Saturday at 12:37pm
Black Friday was America saying "Yeah, I know yesterday I said I was grateful for
what I have, but today I want a lot more and for less."—@rickygervais
Saturday at 03:05pm
"I don't pay good wages because I have a lot of money; I have a lot of money because
I pay good wages." ~Robert Bosch #business #leadership—@ManagersDiary
In a conversation that started on Saturday at 08:14pm
Rasterized, circa 1983: http://t.co/sCVFviHo—@ndw
@ndw surprisingly good quality :-)—@adamretter
Saturday at 08:20pm
Neighbor is not home. Playing The Clash unreasonably loud.—@ndw
Saturday at 08:22pm
On the assumption that it’s foolish to leave any stone unturned, wanted: 1 (or 2)
br appt (duplex pref) in Austin, TX. Non smoking, no pets.—@ndw