(1232 Posts In Total)

2019

2019-02-27 09:25

Look, I read two of the latest newsletters by Sacha Chua and I already learned about two new Org features: org-reverse-datetree and org-bib-template. Moreover, I didn’t know that there was such a thing as a meta repository for ESS users. #emacs

2019-02-27 07:21

I’m finally done with Occupied.

2019-02-26 21:32

I think this is the first time this site is referenced in Sacha Chua’s excellent Emacs newsletter.

2019-02-26 17:53

Mathematica implementations of machine learning algorithms used for prediction and personalization.

This open source project is for Mathematica implementations of statistical and machine learning algorithms that can be used for data analysis, prediction, and recommendation systems.

Note that the Github repository also includes Lua, Java and R code. The companion website is Mathematica for prediction algorithms.

2019-02-26 17:23

A few days ago, I read a thread on Biostars (which I haven’t consulted in a while) on the use of Wolfram Mathematica in bioinformatics, and I wondered why people are so critical of this software. The same applies to Stata (if you saw the recent flame war on Twitter, you know what I mean), albeit in this case there’s not even this man behind it.

2019-02-26 09:26

Long time no see. I have been compiling several pieces of bioinformatics software lately. No issues whatsoever, except for a few glitches with the Boost libraries.

2019-02-25 20:38

I just added permalinks in this section (here, a small hash symbol near the date). I was missing a way to link to previous micro-posts.

2019-02-25 20:30

I’m almost done with Occupied. I initially thought I would be able to finish the last two episodes of the first season this evening, but I’m so tired (I’ve been up since 4am) that I’m afraid I won’t be able to stay up for long.

2019-02-25 18:23

  Peter Erskine Trio, As It Was.

2019-02-25 18:23

I am still unsure how best to use org-journal. I already use a “diary” file where I bookmark important stages of my working day. This way, I get a nice summary with org-agenda. Obviously, I could do exactly the same using org-journal, but I was thinking that it could also be used to record my posts on the main site: (1) I would be writing using Org mode directly, (2) I would get a searchable archive from Emacs directly (and more convenient than deft), and (3) that would be just cool. #emacs

2019-02-25 12:21

How to delete empty lines in a file with Emacs? Useful to clean up an HTML page with a lot of extra blank lines. #emacs

M-x flush-lines RET ^[[:space:]]*$ RET
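Outside Emacs, the same cleanup can be sketched in a few lines of Python. This is only an illustration, not part of the original tip: the function name flush_blank_lines is made up, and Python's \s stands in for the [[:space:]] class used above.

```python
import re

# A line that is empty or contains only whitespace,
# mirroring the ^[[:space:]]*$ pattern given to flush-lines.
BLANK_LINE = re.compile(r"^\s*$")


def flush_blank_lines(text: str) -> str:
    """Return text with every blank or whitespace-only line removed."""
    kept = [
        line
        for line in text.splitlines(keepends=True)  # keep original line endings
        if not BLANK_LINE.match(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    html = "<ul>\n\n  \n<li>a</li>\n\t\n</ul>\n"
    print(flush_blank_lines(html), end="")
```

Note that flush-lines itself only acts on the text after point (or on the active region), so in Emacs it is worth jumping to the beginning of the buffer first.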

2019-02-25 10:50

This moment when you realize that you are stuck with Java 8 on your OS… Two options: use Homebrew (brew cask install java) or proceed manually. I think I will love bioinformatics tools.

2019-02-23 20:51

It’s been a while since I last ran any ML model using caret, especially since Max Kuhn joined the RStudio team to develop a brand new ML pipeline in the name of the tidy new wave: tidymodels, then parsnip (slides near here). Anyway, here is a good tutorial if you want to get started with caret. (via @R_Programming) #rstats

2019-02-23 20:34

And we are finally done with The 100. Looking forward to watching The Expanse during the winter holidays.

2019-02-23 17:56

When you insist on your CLI-based workflow (reproducibility, text-based, etc. you know…) and you realize that Stata 13 does not recognize graph export with a PDF backend (while Stata 15 does) from a Terminal. Back to Encapsulated PostScript then, like in the 90s! #stata

2019-02-23 08:26

While I appreciate that there are such useful Docker images available, I think I will need to build a more lightweight one if I want to stay on the CircleCI free plan. Fortunately, it looks like someone already had the same idea. #rstats

2019-02-23 08:02

Statistical Thinking for the 21st Century. #rstats

2019-02-22 18:56

The first edition of Interpretable Machine Learning is out. (via @ChristophMolnar)

2019-02-22 18:56

Emacs build-status: a nice package that lets you monitor builds on Travis or CircleCI. #emacs

2019-02-22 18:46

Yet another org-powered website. This reminds me that I added a little org-capture template to write those micro-posts without having to open my micro.org file. #org

("b" "Blog post" entry (file+headline "~/org/micro.org" "Micro")
     "** TODO %?\n:PROPERTIES:\n:EXPORT_FILE_NAME:\n:END:\n%^g\n"
     :empty-lines 1)

2019-02-22 18:05

Stephen Wolfram reflecting on his “productive” and digital life. What a man!

2019-02-22 07:46

Didn’t know either: Beware that wc counts newlines, and not lines. (via Irreal)
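To make the difference concrete, here is a small Python sketch (the two helper names are invented for this illustration). Counting newline characters is what wc -l does; counting lines the way a reader would also includes a final line that has no trailing newline.

```python
def count_newlines(data: bytes) -> int:
    """Number of newline characters, which is what `wc -l` reports."""
    return data.count(b"\n")


def count_lines(data: bytes) -> int:
    """Number of lines as a reader sees them, including an unterminated last line."""
    return len(data.splitlines())


if __name__ == "__main__":
    terminated = b"one\ntwo\nthree\n"   # file ends with a newline
    unterminated = b"one\ntwo\nthree"   # last line has no newline
    for sample in (terminated, unterminated):
        print(count_newlines(sample), count_lines(sample))
```

The two counts agree for a file that ends with a newline and differ by one when the last line is left unterminated.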

2019-02-21 19:04

Despite the useful utility under the “File” menu, my attempt at installing a Mathematica package properly failed miserably this morning. I ended up copying/pasting the whole archive into ~/Library/Mathematica/Applications. Anyway, this worked and I am now able to plot phylogenetic trees!

2019-02-21 18:51

Didn’t know there was such a thing: MacJournal (via Jack Baty). Whether you are interested in this app or not, the author provides a nice discussion of the pros and cons of keeping a diary vs. a journal, and of the importance of metadata.

2019-02-21 14:47

Merlin Mann et Marie Kondō sont dans une d’emails, by Bastien Guerry. Nice summary of the situation regarding emails. I already deleted 30k+ mails in one pass so I know what batch processing is.

2019-02-20 20:30

Discrete Stochastic Processes. It’s amazing how many excellent tutorials can be found on MIT OpenCourseWare.

2019-02-20 20:14

  Morcheeba, Who Can You Trust?.

2019-02-20 20:08

I disabled Dropbox syncing on my Mac a long time ago, but I realized yesterday that Transmit now makes it very easy to connect to Dropbox. Even if I no longer use Dropbox these days, that may be a very good option for the future.

2019-02-20 19:55

After jupyter-book, there is now jupytext (via @marcwouts). Looks like we now have a serious competitor to RStudio. #python

2019-02-20 19:50

  Nick Cave & The Bad Seeds, Nocturama. I’m often lazy when it comes to changing a CD.

2019-02-20 10:51

Two handy org commands: org-journal-new-scheduled-entry can be used to schedule future entries in org-journal (see discussion here); org-tree-to-indirect-buffer is a good alternative to org-narrow-to-subtree sometimes. #org

2019-02-19 20:25

Diving into computational molecular biology. It’s a fun world after all, especially compared to medical statistics. I am trying to devise a reliable workflow for taking notes and using a live notebook, mostly inspired by my old setup, but basically it’s all about Org files with tags and “TODO items”, including a diary and helm-bibtex for managing my bibliography. Nothing fancy, but it just has to do the job right after all.

2019-02-19 20:16

  Timber Timbre, Timber Timbre.

2019-02-19 20:14

Today’s lunch:

2019-02-18 20:41

  Nick Cave & The Bad Seeds, Nocturama.

2019-02-18 19:00

Pretty Magit - Integrating commit leaders. I have been using Git leaders for almost two years, but now I realize that I completely forgot about them.

2019-02-18 18:55

  New Order, Power, Corruption & Lies.

2019-02-18 18:55

Today was my first day at my new lab. Everything went fine, despite a very bad night. At least I have been able to go back home without too much dizziness or paresthesia in the legs (I don’t know where this one comes from). Guess what: For the first time in 10 years, I am able to connect my MacBook to the network! #self

2019-02-18 18:50

I am reading the Racket guide again, this time using Dash only. It’s amazing how convenient this application is, especially for navigating between text and function definitions, which by default are all hyperlinked thanks to the Scribble documentation system. #scheme

2019-02-17 20:35

  Nick Cave & The Bad Seeds, Push the Sky Away.

2019-02-17 20:29

I am about to exceed 150 micro-posts in my Org file. (Other posts are published from the terminal directly.) I added a little cookie to keep track of the number of entries, although a slightly harder route would be to write some elisp code. #org

2019-02-17 20:19

I don’t have any big needs in terms of image processing, and I am generally happy with ImageMagick. However, Acorn and Retrobatch (h/t Brett Terpstra) look pretty nice.

2019-02-17 18:39

  Nick Cave & The Bad Seeds, The Boatman’s Call.

2019-02-17 18:34

Just cleaned up my Dropbox a little bit more (6 GB of data, reports and papers accumulated over 8 years!).

2019-02-17 18:15

Machine learning in Clojure with XGBoost. Note that there are bindings for the awesome xgboost in various other languages (Python, Julia, R), not just the JVM. #clojure

Python didn’t become the leader in the field because it’s inherently better or more performant, but because of scikit-learn, pandas and so on. While as Clojurists we don’t really need pandas (dataframes) or similar stuff (everything is just a map, or if you care more about memory and performance a record) we don’t have something like scikit-learn that makes really easy to train many kind of machine learning models and somewhat easier to deploy them.

2019-02-17 18:05

merlin - a unified framework for data-analysis, and many other interesting packages by the same author and coworkers. #stata

2019-02-17 12:07

Again, I’m slowly updating stata-sk. It took me a while to reset the publishing system to use Stata 13 MP instead of Stata 15 since I no longer get a free license for it. This will probably be my last textbook on Stata. #stata

2019-02-17 08:51

Look. Even Racket has some support for statistical data structures like data frames. In addition, here is an essential read if you want to get started with common data structures: An Overview of Common Racket Data Structures. #scheme

2019-02-16 14:14

An analysis of lossless data compression programs: Large Text Compression Benchmark. (via SO; it looks like it is the very first question on the beta site)

The amount of genomic sequence data being generated and made available through public databases continues to increase at an ever-expanding rate. Downloading, copying, sharing and manipulating these large datasets are becoming difficult and time consuming for researchers. We need to consider using advanced compression techniques as part of a standard data format for genomic data. The inherent structure of genome data allows for more efficient lossless compression than can be obtained through the use of generic compression programs. We apply a series of techniques to James Watson’s genome that in combination reduce it to a mere 4MB, small enough to be sent as an email attachment. – Human genomes as email attachments