
Martin T Dengler's personal home page. Nothing at all related to my code, my work, my university, or my high school. Old CV (pdf / tex / docx / txt). I administer this site, as well as xades.com, marydengler.com, and canodog.com.

2014-03-12 : Scoring based on precision and recall

When scoring classifiers one wants to reward both precision and recall. A simple and useful combination of the two is the F1 score, their harmonic mean, which works out to just 2 * TP / (2 * TP + FP + FN), where TP is the number of true positives, FP is the number of false positives, and FN is the number of false negatives. We can do better though. For some discussion, issues, and alternatives see "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation" (2011) by David M W Powers in the Journal of Machine Learning Technologies 2 (1): 37–63.

2013-12-28 : Hacker's image gallery

Two lines are all it takes for a hacker's image and video gallery:

    # the HTML inside the echo strings is assumed; the original tags were stripped from this page
    for f in *.jpg ; do echo "<img src='$f'>" ; done > index.html
    for v in *.webm ; do echo "<a href='$v'>Click to download $v</a>" ; done >> index.html

2013-07-04 : Resistor identification

I found Zach Poff's Resistor Identification PDF to be very helpful, but the SMD and +/-1% tolerance sections were not very useful to me when sorting a bag of cheap resistors, so I made a Hobbyist Resistor Identification Codes PDF (from an SVG) in Inkscape that is easy to print out.

2013-04-10 : Kōaning kōans

Kōans are universal Computer Science folklore. Especially the AI ones. Recursion is a fundamental structure. Lisp-in-lisp is seen as a great achievement, as is any programming language becoming self-hosting. I had never connected these very deeply, possibly because I didn't know the first thing about kōans:

"...in the beginning a monk first thinks a kōan is an inert object upon which to focus attention; after a long period of consecutive repetition, one realizes that the kōan is also a dynamic activity, the very activity of seeking an answer to the kōan."
[1]

2012-08-27 : Elisp function to add-newline-and-indent-after-next-comma

I am actually amazed I've used emacs for programming for so long without writing this; bind to Ctrl-, and wonder how you lived without it:

    (defun mtd-indent-after-comma-space ()
      (interactive)
      ;; only edit when a comma followed by spaces is found on this line
      (when (re-search-forward ",[ ]+" (line-end-position) t)
        (replace-match ",")
        (newline-and-indent)))
    (global-set-key (kbd "C-,") 'mtd-indent-after-comma-space)

2012-08-26 : Python default argument redux

This is the right use of Python's powerful default kwargs feature:

    import time

    def report(when=lambda: time.time()):
        print(when())

    report()
    time.sleep(5)
    report()

The wrong one is documented all over, but often without a clear statement that "Default parameter values are evaluated when the function [is defined]" (ref). I think it's more useful to ask people to take away the lambda: and ask themselves why that doesn't work as they probably would expect.

2012-02-10 : QuickFIX 1.13.3 for Fedora 16

I've made new RPMs of QuickFIX 1.13.3 for Fedora 16 that you can download; the patches are also available separately. I'm working on getting QuickFIX officially into Fedora; if you're interested in helping please check redhat bugzilla #606421.

2011-05-29 : QuickFIX 1.13.3 for Fedora 15

I've made new RPMs of QuickFIX 1.13.3 for Fedora 15 that you can download; the patches are also available separately. Thanks to Dennis Fleurbaaij for some patches. I'm working on getting QuickFIX officially into Fedora; if you're interested in helping please check redhat bugzilla #606421.
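
A minimal sketch of the 2012-08-26 default-argument point, with the lambda taken away in one function and kept in the other. The names report_frozen and report_fresh are made up for illustration, and both functions return the timestamp instead of printing it so the two behaviours can be compared:

```python
import time


def report_frozen(when=time.time()):
    # The default expression ran once, when "def" was executed,
    # so every call without an argument sees that same timestamp.
    return when


def report_fresh(when=lambda: time.time()):
    # The default is a callable; calling it here reads the clock at call time.
    return when()


first_frozen = report_frozen()
first_fresh = report_fresh()
time.sleep(0.05)
second_frozen = report_frozen()
second_fresh = report_fresh()

print(first_frozen == second_frozen)  # the frozen default never changes
print(second_fresh > first_fresh)     # the callable default follows the clock
```

Passing a callable explicitly, as in report_fresh(lambda: 42), overrides the default in the usual way.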

2011-03-31 : tracker hacking

I backported tracker's "index & search for numbers" functionality (without its config-ui nicety) from tracker-bug #503366 to Fedora 14's now-ancient tracker 0.8.17 so I could search for numbers; the change was surprisingly small given how much tracker has moved on:

    diff -ur tracker-0.8.17-p2/src/libtracker-fts/tracker-parser.c tracker-0.8.17-patch-index-numbers-gnome-bugzilla-num-503366/src/libtracker-fts/tracker-parser.c
    --- tracker-0.8.17-p2/src/libtracker-fts/tracker-parser.c 2011-03-30 18:11:50.717630143 +0800
    +++ tracker-0.8.17-patch-index-numbers-gnome-bugzilla-num-503366/src/libtracker-fts/tracker-parser.c 2011-03-31 00:48:06.160630144 +0800
    @@ -354,8 +356,7 @@
                     }
             }
    -        if (!is_valid ||
    -            word_type == TRACKER_PARSER_WORD_NUM) {
    +        if (!is_valid) {
                     word_type = TRACKER_PARSER_WORD_IGNORE;
                     is_valid = TRUE;
                     length = 0;
    @@ -378,18 +379,9 @@
                     if (!start) {
                             start = g_utf8_offset_to_pointer (parser->cursor, char_count-1);
    -                        /* Valid words must start with an alpha or
    -                         * underscore if we are filtering.
    -                         */
    -
    -                        if (type == TRACKER_PARSER_WORD_NUM) {
    -                                is_valid = FALSE;
    -                                continue;
    -                        } else {
    -                                if (type == TRACKER_PARSER_WORD_HYPHEN) {
    -                                        is_valid = parser->parse_reserved_words;
    -                                        continue;
    -                                }
    +                        if (type == TRACKER_PARSER_WORD_HYPHEN) {
    +                                is_valid = parser->parse_reserved_words;
    +                                continue;
                             }
                     }
    @@ -459,7 +451,9 @@
                     return FALSE;
             }
    -        if (word_type == TRACKER_PARSER_WORD_ALPHA_NUM || word_type == TRACKER_PARSER_WORD_ALPHA) {
    +        if (word_type == TRACKER_PARSER_WORD_ALPHA_NUM
    +            || word_type == TRACKER_PARSER_WORD_ALPHA
    +            || word_type == TRACKER_PARSER_WORD_NUM) {
                     gchar *utf8;
                     gchar *processed_word;

It's very nice to be able to search for numbers now.

2011-03-29 : Lisp on Fedora revisited

I wanted to see what I could do with vecto so I had another go at Common Lisp on Fedora, almost three years since it crashed and burned for me.
"yum install sbcl" was fine, but then hit the same symptom as years ago: libraries don't install using ASDF-INSTALL (this was the "old" way, even in in 2008). Luckily, the other old symptom of "libraries don't install the new way, either" has changed: the old old way is gone, the old new way is now the old way, and the new way is really new, and actually works: sudo yum -y install sbcl git clone git://gitorious.org/clbuild2/clbuild2.git cd clbuild2 ./clbuild quickload vecto ...lots of downloading ./clbuild prepl --noinform ...much amusement with dumping core files and such * (require 'vecto) much system loading NIL *  If things keep working this well soon people will be asking heretical things like "stop telling me about system loading when it works". 2010-12-02 : Converting scanned pdfs to text on linux (OCR) To convert scanned PDFs to text on linux - where the PDFs don't already have text embedded in them, of course - I have come up with the below process using tesseract for the OCR and unpaper to correct for the vagaries of the scanning process (slightly off-kilter document images, for example). I've started with PNG files, but of course you can just use ImageMagick to convert foo.pdf foo.png to get n foo-i.png files for an n-page PDF.  for f in${output_basename}*.png ; do stem=$(basename$f .png) pngtopnm ${stem}.png >${stem}.pnm | tee -a ${output_basename}.log unpaper -b 0.5 -w 0.8 -l single -s a4${stem}.pnm ${stem}-unpapered.pnm | tee -a${output_basename}.log convert ${stem}-unpapered.pnm -depth 8 -monochrome${stem}.tif | tee -a ${output_basename}.log tesseract${stem}.tif ${stem}-tesseract | tee -a${output_basename}.log done If you're using Fedora linux, yum can install all the prerequisites for you:  sudo yum -y install ImageMagick tesseract poppler-utils unpaper netpbm-progs 2010-10-31 :   Using Python modules from C Python extensions   When writing a Python extension module, the basics of using another python module's attribute are not overly documented. 
Constructing an instance of another module's class, for example, is a bit of a mystery. There are at least two object creation primitives and a lot of C APIs to get in the way of integrating gracefully with python-implemented code. Here's how one can do it:

To import a module, normally one would want to have the same thing happen as when import modulename is interpreted. This is done by the PyImport_Import function, like this:

    /* PyString_FromString is the Python 2 API; on Python 3 use PyUnicode_FromString */
    PyObject *mod_name = PyString_FromString("modulename");
    PyObject *module = PyImport_Import(mod_name);
    Py_DECREF(mod_name);
    if (module == NULL) {
        return NULL;  /* import failed; the exception is already set */
    }

Then, to get a module's attribute, like decimal.Decimal, recall that module (and class) attribute access (the ".") means the same thing as looking up the name in the module's dictionary. The general "get some attribute of this object" function is PyObject_GetAttrString, and it's used like this (continuing the code from above):

    PyObject *attribute = PyObject_GetAttrString(module, "attribute");
    if (attribute == NULL) {
        Py_DECREF(module);
        return NULL;  /* lookup failed; the exception is already set */
    }

Call the object's constructor.
Again, we want to do what the interpreter would do when confronted with module.attribute(); this is PyObject_CallObject (value here is a C++ std::string holding the constructor argument):

    PyObject *attribute_ctor_args = Py_BuildValue("(s)", value.c_str());
    PyObject *attribute_instance = PyObject_CallObject(attribute, attribute_ctor_args);
    Py_DECREF(attribute_ctor_args);
    Py_DECREF(attribute);
    Py_DECREF(module);
    /* NULL here means the call failed and the exception is already set */
    return attribute_instance;

The arguments passed to the constructor have to be created first, perhaps using Py_BuildValue:

    PyObject *attribute_ctor_args = Py_BuildValue("(s)", "any char * will do");
    PyObject *attribute_instance = PyObject_CallObject(attribute, attribute_ctor_args);
    Py_DECREF(attribute_ctor_args);

2010-09-28 : The last rosetta stone you'll need

2010-07-05 : Saving your MBR

Save/restore your MBR:

without the partition table:

    dd if=/dev/hda of=hda-mbr-nopart bs=446 count=1

with the partition table:

    dd if=/dev/hda of=hda-mbr-full bs=512 count=1

just the partition table (the 64 bytes starting at byte offset 446, so the block size has to be 1 for skip to count bytes):

    dd if=/dev/hda of=hda-parttable bs=1 count=64 skip=446

2010-06-20 : QuickFIX patches to build 1.13.3 on Fedora 12 and higher

I've got some patches and an updated specfile to get quickfix 1.13.3 built on Fedora 12 and higher. This includes the python bindings (tested a bit) and examples (untested). You can download the rpms here. I am going to work on getting them submitted to the quickfix project and into Fedora.

2010-06-17 : QuickFIX concepts

The absolute minimum you need to know about QuickFIX / FIX programs is:

- FIX specifies how processes talk to each other over a transport layer like TCP/IP.
- FIX clients talk (initiate message exchanges); FIX servers listen (accept message exchanges). In FIX terms these are initiators and acceptors.
- FIX servers can do different things, like process orders or send market data to clients.
- The first thing you will usually write with FIX is a client. It will open a connection to a FIX server and send a FIX login (Logon) message.
- FIX messages are key/value pairs where the keys are numbers and the messages are sent over the wire as header and message data - the message data will be almost human-readable.
- A FIX message looks almost like this on the wire (each "." stands for a non-printable byte; inside the message it is the SOH, 0x01, field delimiter):

    E.....@.@.N{RDPl...]. .P $+..%%....6...... {....]Q.8=FIX.4.4.9=82.35=5.34=6.49=XXXXXXXXXXXXXXXXXXXXXX.52=20100617-10:37:36.348.56=XXXX.57=XXXXXXXXX.10=249.

  Notice the "8=FIX..." part, which is where the message / payload starts.
- There are different versions of FIX...versions 4.1 to 4.4 are most prevalent, and v5.0 is the latest. They're not terribly compatible. And just because an app is FIX vX.Y compliant doesn't mean you'll be able to rely upon much of the content of key data fields...like any message format, the message contents a) matter; and b) are often abused by low-quality implementors.

Can you think of anything else? Let me know via email.

2010-05-13 : QuickFIX on Fedora 12

Here's how I got quickfix running on Fedora 12:

    wget http://kojipkgs.fedoraproject.org/packages/quickfix/1.12.4/9.fc12/x86_64/quickfix-1.12.4-9.fc12.x86_64.rpm
    sudo yum --nogpgcheck localinstall quickfix-1.12.4-9.fc12.x86_64.rpm

Ok, so that's just to get libquickfix.so. To actually do something useful, like connect to a FIX endpoint, there's more. But before we get to that, how about building the latest version of quickfix:

    # standard rpmdev fedora setup:
    sudo yum -y install yum-utils rpmdevtools
    rpmdev-setuptree
    # verify that we can build the existing RPM
    sudo yum-builddep quickfix-1.12.4-9.fc12.src.rpm
    rpm -Uvh quickfix-1.12.4-9.fc12.src.rpm
    rpmbuild -ba ~/rpmdev/SPECS/quickfix.spec
    # WORKSFORME. now let's upgrade to 1.13.3
    # (assumes the spec's Release line uses the usual %{?dist} macro):
    sed -i -e 's/1.12.4/1.13.3/; s/9%{?dist}/1%{?dist}/' ~/rpmdev/SPECS/quickfix.spec
    rpmbuild -ba ~/rpmdev/SPECS/quickfix.spec

Next update: getting python, ruby, and java bindings built.
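
A rough sketch of the tag=value format described in the 2010-06-17 QuickFIX concepts entry, in plain Python rather than QuickFIX itself. It assumes fields are separated by the SOH (0x01) byte that the capture shows as "."; parse_fix, checksum and build_fix are hypothetical helper names, and SENDER / TARGET are placeholder CompIDs, not values from the capture:

```python
SOH = "\x01"  # FIX field delimiter; rendered as "." in the capture


def parse_fix(raw):
    """Split a raw FIX message into an ordered list of (tag, value) pairs."""
    fields = []
    for field in raw.split(SOH):
        if not field:
            continue  # skip the empty string after the trailing SOH
        tag, _, value = field.partition("=")
        fields.append((int(tag), value))
    return fields


def checksum(raw):
    """Sum of every byte before the CheckSum (10) field, modulo 256."""
    body_end = raw.rfind(SOH + "10=") + 1  # include the SOH before "10="
    return sum(raw[:body_end].encode("ascii")) % 256


def build_fix(begin_string, body_fields):
    """Assemble a message, filling in BodyLength (9) and CheckSum (10)."""
    body = "".join("%d=%s%s" % (tag, value, SOH) for tag, value in body_fields)
    partial = "8=%s%s9=%d%s" % (begin_string, SOH, len(body), SOH) + body
    return partial + "10=%03d%s" % (sum(partial.encode("ascii")) % 256, SOH)


# 35=5 is a Logout message, as in the capture; SENDER/TARGET are made up.
message = build_fix("FIX.4.4", [(35, "5"), (34, "6"), (49, "SENDER"), (56, "TARGET")])
fields = parse_fix(message)

print(fields[0])                                    # (8, 'FIX.4.4')
print(int(dict(fields)[10]) == checksum(message))   # True
```

Tags are kept as an ordered list rather than a dict because FIX cares about field order (8, 9 and 35 must come first, 10 last), and real messages can repeat tags inside repeating groups.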

2010-02-03 : Maemo 5 SDK + Fedora hacking

There are a few guides to installing a Maemo 5 scratchbox on a Fedora 12 x86_64 machine, but none of them worked for me without modification. In particular, the vm.mmap_min_addr magic really ate up a few minutes. To save one or two other people some time, here's what worked for me:

    # do this as root
    yum -y install Xephyr dbus hal
    groupadd sbox
    usermod -a -G sbox $LOGNAME
    echo >> sysctl.conf <> /etc/apt/apt.conf
    sysctl -p

    # do this as non-root user
    wget http://repository.maemo.org/stable/5.0/maemo-scratchbox-install_5.0.sh http://repository.maemo.org/stable/5.0/maemo-sdk-install_5.0.sh
    chmod a+x ./maemo-scratchbox-install_5.0.sh ./maemo-sdk-install_5.0.sh
    patch maemo-scratchbox-install_5.0.sh <

"May ours be the noble heart
Strong to endure
Daring tho' skies be dark