2019
Algorithms in Bioinformatics: A Practical Introduction. (via SO)
I haven’t yet embraced the full power of Julia for data munging, but this
article is surely a gem for understanding the language at a deeper level. #julia
Useful tips to build and manage R packages: rOpenSci Packages: Development,
Maintenance, and Peer Review. #rstats
Probability and Statistics: a simulation-based introduction, by Bob Carpenter. I
like it when there are instructions for those like me who do not want to install
RStudio to build the book. #rstats
Causal Inference Book, Python code hosted on GitHub (by the author of the Stata kernel). (via @kaz_yos)
How to stay as private as possible on Apple’s iPad and iPhone. (via Irreal)
I’m halfway through my new TV show (Occupied), but I’m struggling to motivate
myself to move forward right now, even to watch TV. Besides that, I’m
finally getting a job back. Let’s just hope I don’t go back to the hospital too
soon. #self
Why the 3? Earlier in the morning I was reading one of the latest posts published by John D. Cook about dose finding studies. I am well aware of the 3+3 design. Incidentally, I attended a meeting yesterday where a PhD student was presenting his work in microbiology, and they used triplicates. It is interesting that 3 looks like a magic number in both cases, even though it does not play the same role. Maybe I should drop a note in a few days.
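For readers who have not met the 3+3 design, the textbook escalation rule fits in a few lines. What follows is a minimal sketch in Python of the commonly described rule, not of any particular trial protocol; the function name and the returned labels are made up for illustration.

```python
def three_plus_three(dlt_first, dlt_expansion=None):
    """Decide the next step at one dose level under the classic 3+3 rule.

    dlt_first: dose-limiting toxicities (DLTs) among the first 3 patients.
    dlt_expansion: DLTs among 3 additional patients, if that cohort was run.
    """
    if not 0 <= dlt_first <= 3:
        raise ValueError("dlt_first must be between 0 and 3")
    if dlt_first == 0:
        return "escalate"  # 0/3: move on to the next dose
    if dlt_first >= 2:
        return "stop"      # 2 or more out of 3: too toxic
    # Exactly 1/3: three more patients are treated at the same dose.
    if dlt_expansion is None:
        return "expand"
    if not 0 <= dlt_expansion <= 3:
        raise ValueError("dlt_expansion must be between 0 and 3")
    # 1/6 overall allows escalation; 2 or more out of 6 stops the trial.
    return "escalate" if dlt_expansion == 0 else "stop"
```

The 3 here is a cohort size chosen for a decision rule, whereas triplicates in the lab are replicates used to estimate variability.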
Not sure how we can think of GTD when we spend about one hour cleaning up defunct stuff on our HD, but sure we are close…
One of the first hits when looking for “Lisp and bioinformatics” on the internet:
How the strengths of Lisp-family languages facilitate building complex and
flexible bioinformatics applications. #lisp
I’ve been following Greg Stein on Caches to caches for a long time now, because the site has such a beautiful design and useful material on Emacs and Org mode. Recently they published a series of posts on AI and ML.
disk.frame is a new (dplyr-compliant) R package to manipulate structured tabular
data that doesn’t fit into RAM, in the spirit of Dask for Python. #rstats
Another nice article about GTD by BSAG. I enjoy reading her blog posts, and I really love her website design. Funny thing: I was just reading some old posts written by Bastien Guerry on Org mode.
Gary Peacock, Jack DeJohnette & Keith Jarrett, My Foolish Heart (Live at Montreux).
Jack DeJohnette, Ravi Coltrane & Matt Garrison, In Movement.
Portacle is a complete IDE for Common Lisp that you can take with you on a USB stick.
If you are looking for a quick solution, here it is. Otherwise, learn Emacs for
good. #emacs
Staying with Common Lisp. A safe non-move, perhaps? On a related note, here is an
enlightening discussion about Racket vs. Lisp: Why I haven’t jumped ship from
Common Lisp to Racket (just yet). #lisp #scheme
While I usually run Slime for a little Lisp hacking, I noticed that serious
people are looking at SLY, Sylvester the Cat’s Common Lisp IDE for Emacs. It
looks like there is even a Spacemacs layer. #emacs #lisp
Interesting read. (via Daniel Lemire)
Though we age, it is unclear how our bodies keep track of the time (assuming they do). Researchers claim that our blood cells could act as time keepers. When you transplant organs from a donor, they typically behave according to the age of the recipient. However, blood cells are an exception: they keep the same age as the donor. What would happen if we were to replace all blood cells in your body with younger or older ones?
Yet another mind-mapping tool if you are not used to Emacs Org mode: Hook. (via Jack Baty)
After Jupyter notebook, we now get Jupyter book. Looks like a serious
alternative to RMarkdown/Gitbook (aka bookdown). #python
Just found Racket Machine Learning – Core. #scheme
Machine Learning Refined, with nice blog posts by Jeremy Watt & Reza Borhani.
I still read Mastering Emacs from time to time. Recently, I was just checking
an article on regular expressions. I have been using Emacs for about 15 years,
yet I am afraid that I only became really comfortable with most key chords
after two or three years of Spacemacs. It is not that I really like modal
editing (I don’t like it at all, in fact), but the consistent key bindings
conveyed via which-key and the configuration layers for most packages make it a
really pleasant tool to use on a daily basis. I’ve come to have only Emacs on my
desktop. No more iTerm2 or Marked2 or even desktop icons. #emacs
The more I use Org for authoring simple or more complex text documents, the more I like it. I like to think of it as Markdown with better markup for links, code blocks, tables, and references, and of course there’s Emacs inline preview. Except for collaborating with colleagues or drafting short RMarkdown documents, I mostly stopped using Markdown these days. Maybe I should revisit some old Md files and just convert them to Org.
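(defun markdown-convert-buffer-to-org ()
  "Convert the current buffer's content from Markdown to Org format.
The result is written next to the visited file, with a .org extension."
  (interactive)
  (unless (buffer-file-name)
    (user-error "Buffer is not visiting a file"))
  ;; Pipe the whole buffer to pandoc; quote the output path for the shell.
  (shell-command-on-region
   (point-min) (point-max)
   (format "pandoc -f markdown -t org -o %s"
           (shell-quote-argument
            (concat (file-name-sans-extension (buffer-file-name)) ".org")))))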
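For converting a whole directory of old Markdown files in one go, a small script outside Emacs may be handier than visiting each buffer. Here is a minimal sketch in Python, assuming pandoc is installed and on the PATH; the function names `pandoc_command` and `convert_tree` are hypothetical, chosen for illustration only.

```python
import subprocess
from pathlib import Path


def pandoc_command(md_file: Path) -> list[str]:
    """Build the pandoc call that converts one Markdown file to Org."""
    org_file = md_file.with_suffix(".org")
    return ["pandoc", "-f", "markdown", "-t", "org",
            "-o", str(org_file), str(md_file)]


def convert_tree(root: Path, dry_run: bool = True) -> list[list[str]]:
    """Convert every .md file under root; with dry_run, only list the commands."""
    commands = [pandoc_command(p) for p in sorted(root.rglob("*.md"))]
    if not dry_run:
        for cmd in commands:
            # Assumption: pandoc is available; check=True raises on failure.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `convert_tree(Path("notes"))` only returns the commands it would run; pass `dry_run=False` to write the `.org` files next to the originals.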
See also: Org-Mode Is One of the Most Reasonable Markup Languages to Use for Text.
A few days ago, I noticed someone citing A Computational Approach to Statistical Learning on Twitter. I no longer buy statistics books so I can’t tell if it is worth a read, but I note that the author of the R package bigmemory is one of the co-authors.
Just found what I think is one of the best concise tutorials on “How to GitHub” if you are looking to collaborate on a common repository. As always, it works best when you read the Magit manual and check what’s available there.
How to blog. Nice take by Tom MacWright. I don’t have a very strict schedule. However, I’ve been trying to post more or less regularly in recent years (sometimes even just links from Twitter bookmarks), specifically to avoid letting my blog die.
Today I was reading Jack Baty’s latest posts and I noticed an interesting micro-post about keyboard versus mouse usage.
The stopwatch consistently proves mousing is faster than keyboarding.
I think this deserves two additional remarks. First, it depends on the task at hand: for instance, even if I prefer reading email with Apple Mail, I use mu4e under Emacs because I find it more convenient for bulk actions like archiving or deleting a bunch of messages. Think about it: you just have to use your preferred movement keys or the arrow keys and strike a key, and it’s all done! Likewise, for text editing or interacting with a REPL, I find Emacs keybindings much more powerful than any combination of custom Services or even TextExpander together with a mouse. I believe Vim users would agree as well. Second, this does not account for people who do not use a mouse at all. I for one have always been very happy with the MacBook trackpad, and I become a lot slower when I have to use a mouse, notwithstanding the fact that it is very bad for the elbow and wrist. For most movements, I use the trackpad and I do not worry much about Emacs or Vim keybindings, because there I am faster with the trackpad. Hence, we had better state clearly which actions are better performed using a mouse before claiming that the mouse wins over the keyboard.
After following months of Twitter discussion, several months ago, about which
software (R or Python) is best for data science, it is now time for the R vs.
Stata debate, here and there. Admittedly, Stata is paid software and does not
offer the same scripting facilities as R for some tasks, mainly
non-statistical ones. However, what’s the point? Did anyone ever mention the
fact that Stata has a GUI which completely mimics the command-line operations,
so that people afraid of typing commands or just interested in running a
logistic regression on a well-formed dataset can just do it in under a minute?
It is slow with some estimators or optimization approaches (e.g., gllamm), and
we had to wait quite a while to get full support for Unicode and XLS, better
graphical rendering, etc. But the versioning system allows one to reproduce any
result obtained with an earlier version of Stata. And it does interact very
well with Stan and R, too. The question is not which software is better; the
real question is: who is the end user? #rstats #stata
Fun fact: I saved a database from Stata 15 in the old format (i.e., compatible
with Stata 13). I cannot view Unicode characters in the Stata GUI, but it works
perfectly fine when run through Emacs/ESS! #stata
Back to a fully functional Spacemacs, after a complete reinstall. Some minor
annoyances with MELPA actually, but nothing serious; fixed a weird bug with the
ocaml layer, since I learned that the syntax-version layer should come before
ocaml, but otherwise everything is fine. Also, I’m trying to go all Helm
instead of Ivy. #emacs
I am not very lucky with Spacemacs these days. Now, SPC-/ to search project for
text (aka, spacemacs/search-project-auto) is no longer working. Not funny, trust
me. #emacs
A recent tweet reminded me of gtools, a Stata package that aims to speed up
built-in commands for data wrangling. I should give it a go. #stata
Hacker Tools: A user-friendly introduction to various command line utilities, editors and VCS. (via @newsycombinator)
Apache Arrow and Feather are two interesting projects that I think should be
available in data science-related PLs. Recently, Rust joined the list, at least
regarding Arrow: DataFusion: A Rust-native Query Engine for Apache Arrow. #rust
📖 Zoé Valdés, Une habanera à Paris (Gallimard, 2005)
Magithub (soon, forge) is now part of Spacemacs/magit. No need to add further
configuration to your init.el. Today I was trying to file an issue on one of
my repos and I figured out that there’s some trouble at the moment. #emacs
Just threw out more than 30k messages from my Gmail account. I have a local copy, so no worries, but the Google team will have a harder time analyzing it. Incidentally, I just came across a new testimony from people tired of Google.
Last round shown below:
BTW, did you know that Google actually stores everything you buy based on payment or shipping receipts?
Updating my global dist for the newly released v1.1 of Julia. Installing
packages (e.g. Gadfly) is much easier and smoother compared to the preceding
versions (prior to v1). The only caveat is that rendering plots via Gadfly is
kind of slow, especially compared to other graphing engines (R, Gnuplot,
Mathematica, or even Stata). #julia
