can anyone validate this?
Lonni J Friedman
netllama at gmail.com
Wed Aug 22 11:59:55 PDT 2012
On Wed, Aug 22, 2012 at 11:46 AM, Doug Hunley <doug.hunley at gmail.com> wrote:
> We've got an application here at work that writes heavily to /data,
> which is a fiber SAN. One of the writes is to /data/indexes, which has
> become *very* access-time-sensitive with regard to the application
> performance. A fancy (and expensive) consultant suggested we create a
> tmpfs mounted at /indexes and point the app there since writing to
> memory is always faster than writing to disk. I then asked how we
> handle getting those indexes written down to disk in the background
> periodically, and he said to use symlinks! In his thinking, doing
> this:
> /indexes/data.idx -> /data/indexes/data.idx
> will give us 'memory speeds' for the write as far as the app is
> concerned and then 'the kernel will background flush to the san'. I
> think he's full of shit, but I wanted to ask the gurus here before
> starting a fight :)
He's full of shit. A symlink only redirects the path lookup; the data
still lands on the target filesystem, so the same IO bottlenecks that
exist when writing directly to disk exist when writing over a symlink.
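It's easy to demonstrate that the symlink buys nothing. A minimal sketch, using throwaway paths under /tmp as stand-ins for the proposed /indexes (tmpfs) and /data/indexes (SAN):

```shell
# Stand-ins for the consultant's layout: "links_demo" plays the tmpfs
# mount, "san_demo" plays the SAN-backed directory.
mkdir -p /tmp/san_demo /tmp/links_demo
ln -sf /tmp/san_demo/data.idx /tmp/links_demo/data.idx

# Writing "through" the symlink creates the file on the target
# filesystem; the symlink itself never holds any data.
echo "index data" > /tmp/links_demo/data.idx

ls -l /tmp/links_demo/data.idx   # just a link
cat /tmp/san_demo/data.idx       # the bytes live at the target
```

Every write the app makes to the link goes straight to the SAN-backed file, at SAN speed.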
Also, if you're chucking all of this super important data into tmpfs
and praying that it gets written to the (over the network) SAN during
a background flush, what happens if the server crashes before that
write happens? It's not out of the realm of possibility for the server
to have enough RAM that quite a lot of data could be floating around
in memory for a while before it's written off to disk.
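On Linux you can actually watch how much written-but-unflushed data the kernel is holding at any moment:

```shell
# Dirty = written to the page cache but not yet on disk; Writeback =
# currently being flushed. On a box with a lot of RAM, Dirty can grow
# large before anything reaches stable storage.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```

All of that is exactly what you'd lose in a crash.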
Regardless, you could test this entire scenario out relatively easily
by setting up this silly symlink from tmpfs thing, and trying to write
a bunch of huge files (bigger than RAM+swap), and seeing what kind
of perf you get. My guess is that it will go to shit as soon as you
write more data than can fit in RAM.
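A quick way to get those numbers, assuming dd is available (the path and size here are placeholders; for the real test, point TARGET at the SAN mount and at the tmpfs-plus-symlink setup in turn, with a file bigger than RAM+swap):

```shell
TARGET=/tmp/ddtest.bin
# conv=fdatasync makes dd flush to stable storage before reporting its
# timing, so the throughput figure reflects the disk, not the page cache.
dd if=/dev/zero of="$TARGET" bs=1M count=64 conv=fdatasync
```

Run it once against each path: if the symlink trick did what the consultant claims, both runs would show tmpfs-like numbers, and I'd bet they won't.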
More information about the Linux-users
mailing list