Friday, 26 February 2016

Ed Oates, from Oracle

Useful insights from one of the Oracle Co-founders https://vimeo.com/30929523

I think this part of the slide has a few good key points, as he describes in the video. I found the first half the best; the second half is a bit more business-orientated on some specifics (such as patenting).


Wednesday, 3 February 2016

Great Server hardware

Feast your eyes on these:

This 90 bay 4U monster http://www.supermicro.co.uk/products/chassis/4U/946/SC946ED-R2KJBOD.cfm  and it is just over 102 kg!


Go back around 10 years and the first one I know of is this: http://techreport.com/blog/13849/behold-thumper-sun-sunfire-x4500-storage-server

48 NVMe? - https://www.supermicro.com/products/system/2U/2028/SSG-2028R-NR48N.cfm

http://www.theregister.co.uk/2016/02/01/product_blastoff_by_emc/


Thursday, 7 January 2016

Nvidia, raising standards

I've recently been looking into the hardware and architectural changes between my current card (Nvidia GTX 680) and the newer ones, and where the next phases lead. It makes for interesting reading.

This public doc is a good showcase of some of the main reasons behind the changes: Kepler-GK110-Architecture

It compares the goals and the changes made going from Fermi to Kepler. I quite like some of these, the atomic operation improvements for one.

The next one gives some simplified points on going from Kepler to Maxwell - Top 5 things to know about Maxwell

Overview - Maxwell

This is great because you can now buy a card with almost 1,000 cores for under £100 that doesn't require separate power connectors either. That saves money on buying a PSU to support it, as you had to before, and the electricity bill will be lower. On a large scale this simply means many more people will find upgrading their older systems/desktops a viable option, whilst HPC and other distributed computing projects can spend less and save on running costs across a larger infrastructure.

Combined with this, Micron (Micron 8GB GDDR5) are working to produce greater amounts of memory that can be used with a graphics card. Together with the Nvidia improvements, this will also benefit those who love high-end games and will probably want to transition to 4K monitors, where much more processing is required across many more pixels. We also have things like the new HEVC video compression/decompression being introduced. Just need the GPU to decode...

GeForce 1000 series  - the next gen, named "Pascal"

Also this NVLink is great! - nvlink, what is NVLink? "5-12x higher bandwidth" etc.

Excellent news for HPC, I'd like to see progress made with nuclear fusion reactors for defo. - exascale-supercomputing


The other point to mention is that clock speeds are lower in the newer architecture whilst still giving better performance. The higher the clock frequency, the more heat is generated, so cooling in this case will be less noisy (if using a fan) as the chip won't run as hot. A CPU or GPU that runs at a higher clock rate requires substantially more power, i.e. a 4GHz core will typically use much more than double the power of a 2GHz core (assuming the cores are of the same architecture), because higher clocks usually need a higher voltage as well and dynamic power rises roughly with voltage squared times frequency. You can always look up "cpu clock speed vs power consumption", along with the charts, docs etc., if you disagree.
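
The 4GHz versus 2GHz comparison above can be sketched with the common approximation that dynamic power is proportional to capacitance x voltage squared x frequency. This is only a rough model: the function name is made up for illustration, static/leakage power is ignored, and the default that voltage scales linearly with frequency is an assumption rather than a measured figure.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio=None):
    """Dynamic power relative to a baseline core, using P ~ C * V^2 * f.

    freq_ratio    -- new clock / baseline clock (e.g. 4 GHz / 2 GHz = 2.0)
    voltage_ratio -- new voltage / baseline voltage; if omitted we ASSUME
                     voltage scales linearly with frequency (worst case)
    """
    if voltage_ratio is None:
        voltage_ratio = freq_ratio  # assumption, not a measured value
    # Capacitance is taken as unchanged (same architecture), so it cancels out.
    return voltage_ratio ** 2 * freq_ratio


if __name__ == "__main__":
    # 4 GHz vs 2 GHz, voltage raised in step with the clock
    print(relative_dynamic_power(2.0))
    # 4 GHz vs 2 GHz, voltage somehow held constant
    print(relative_dynamic_power(2.0, 1.0))
```

With the voltage held constant the power only doubles; once the voltage has to rise with the clock, the ratio climbs well past 2x, which is the point made above.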

Forgot to mention that it is also helpful to programmers & developers. I recommend you read all the way through this doc due to some key considerations - Is parallel programming hard?



Tuesday, 22 December 2015

ZFS, like a work of art

A subtle thing with ZFS is that the drive LEDs flash quite differently to typical storage arrays; when you understand more of what goes on under the hood you'll know why that is. So just by looking around a DC you'd be able to spot which servers are running it, for example. You can see this type of effect here to illustrate - https://www.youtube.com/watch?v=LS3cfl-7n-4

Of course, that's ZFS on Linux. An implementation that runs through FUSE (such as zfs-fuse) is less efficient than a filesystem in kernel space, as elaborated across various posts, some examples: https://lkml.org/lkml/2007/4/16/133 , https://lkml.org/lkml/2007/4/16/83 . The native ZFS on Linux port runs as a kernel module instead.

Example pool using raidz2 with hot spares, which will automatically take over in the event a drive or two fail. Listing devices with brace expansion like this is always easier - c4t{0..1}d0. You also have to get the order of the arguments correct (the raidz2 keyword goes before its devices) or you may be second guessing...

# zpool create data c0t50004CF210AD1C22d0 c0t50004CF210BE51F1d0 c0t50004CF210BE51F3d0 c0t50004CF210BE5214d0 c4t{0..1}d0 raidz2
Unable to build pool from specified devices: invalid vdev specification: raidz2 requires at least 3 devices

# zpool create -O atime=off -O compress=lz4 data raidz2 c0t50004CF210AD1C22d0 c0t50004CF210BE51F1d0 c0t50004CF210BE51F3d0 c0t50004CF210BE5214d0 c4t{0..1}d0
# zpool add data spare c4t3d0 c5t3d0
# zpool status
  pool: data
 state: ONLINE
  scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        data                       ONLINE       0     0     0
          raidz2-0                 ONLINE       0     0     0
            c0t50004CF210AD1C22d0  ONLINE       0     0     0
            c0t50004CF210BE51F1d0  ONLINE       0     0     0
            c0t50004CF210BE51F3d0  ONLINE       0     0     0
            c0t50004CF210BE5214d0  ONLINE       0     0     0
            c4t0d0                 ONLINE       0     0     0
            c4t1d0                 ONLINE       0     0     0
        spares
          c4t3d0                   AVAIL  
          c5t3d0                   AVAIL

Then, as always, test the assumption, and it works as expected. I've got hot swap capabilities, so I pulled a drive out to simulate a failure and then tried writing some data, and it looks to have worked.

# zpool status -xv
  pool: data
 state: DEGRADED
status: One or more devices are unavailable in response to persistent errors.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or 'fmadm repaired', or replace the device
    with 'zpool replace'.
  scan: resilvered 136K in 1s with 0 errors on Wed Dec 23 05:53:44 2015

config:

    NAME                         STATE     READ WRITE CKSUM
    data                         DEGRADED     0     0     0
      raidz2-0                   DEGRADED     0     0     0
        c0t50004CF210AD1C22d0    ONLINE       0     0     0
        c0t50004CF210BE51F1d0    ONLINE       0     0     0
        spare-2                  DEGRADED     0     0     0
          c0t50004CF210BE51F3d0  UNAVAIL      0    24     0
          c4t3d0                 ONLINE       0     0     0
        c0t50004CF210BE5214d0    ONLINE       0     0     0
        c4t0d0                   ONLINE       0     0     0
        c4t1d0                   ONLINE       0     0     0
    spares
      c4t3d0                     INUSE  
      c5t3d0                     AVAIL  

device details:

    c0t50004CF210BE51F3d0      UNAVAIL       too many errors
    status: FMA has faulted this device.
    action: Run 'fmadm faulty' for more information. Clear the errors
        using 'fmadm repaired'.
       see: http://support.oracle.com/msg/ZFS-8000-FD for recovery
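
For the six-disk raidz2 layout above, the trade-off between usable space and fault tolerance can be estimated with a few lines of Python. This is a back-of-the-envelope sketch only: the function name and the 4 TB disk size are hypothetical (the post does not state the drive sizes), and it ignores metadata, padding and other allocation overhead, so real usable space will be somewhat lower.

```python
def raidz_summary(total_disks, parity, disk_size_tb):
    """Rough capacity estimate for a single raidz vdev.

    total_disks  -- disks in the vdev (spares are NOT counted)
    parity       -- 1 for raidz1, 2 for raidz2, 3 for raidz3
    disk_size_tb -- size of each disk in TB (assumed identical)
    """
    if total_disks <= parity:
        raise ValueError("need more disks than parity devices")
    data_disks = total_disks - parity
    return {
        "data_disks": data_disks,
        "usable_tb": data_disks * disk_size_tb,  # ignores ZFS overhead
        "tolerated_failures": parity,            # before any spare kicks in
    }


if __name__ == "__main__":
    # Six disks in raidz2, as in the pool above; 4 TB per disk is an assumption.
    print(raidz_summary(total_disks=6, parity=2, disk_size_tb=4.0))
```

The two hot spares add nothing to capacity; they just shorten the time the vdev spends degraded, as the spare-2 entry in the status output shows.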

Tuesday, 3 November 2015

ZFS born in Zion

Interesting vids from the recent OpenZFS Summit 2015. Recommend you watch these - https://www.youtube.com/watch?v=dcV2PaMTAJ4&index=6&list=PLaUVvul17xSedlXipesHxfzDm74lXj0ab

As Jeff Bonwick explains, ZFS has had links to The Matrix since around the time of its conception. That's why Oracle documentation has things in there about Neo, Trinity, tank and Morpheus. Amazing film with memorable quotes:

Morpheus: "You're faster than this. Don't think you are, know you are."
Morpheus: "I'm trying to free your mind, Neo. But I can only show you the door. You're the one that has to walk through it"

Let's not forget Laurence Fishburne was also Cowboy Curtis - https://www.youtube.com/watch?v=3jsCxNK4vAc

Laurence and Samuel aren't the same person....
https://www.youtube.com/watch?v=8Y1o8910Xs4





Sunday, 1 November 2015

Hardware or Software RAID?

About 4-5 years ago, when I first made a start on learning and using Linux, one of the questions was about RAID. Given you have more than one way to skin a cat, so to speak, which way to skin it?
I was told by a manager (and he was saying this with 100% solidity) "hardware RAID IS the best RAID". I have yet to see this proven.

Loose Background

Years ago hardware RAID used to be the better option. CPUs were considerably slower, and software RAID, which is constantly running, would consume a fair amount of CPU resources (thus additional overhead). Combined with the lack of well designed software RAID (or, for example, firmware RAID on older motherboards), this meant you would be better off paying for a dedicated card to handle it. Such a card also has things like a BBU and cache, so it is able to reorganise write operations prior to flushing to disk, whilst keeping writes ready to be flushed even if power is temporarily out, to maintain a consistent state.

Questions arose and can be asked, such as:
What if the hardware RAID card fails?
If software RAID is improved can we spend less money on HW?
Can rebuilds be done faster through software than hardware RAID?
Perhaps we should integrate the LVM/VFS layers together?
Should software RAID be done in user space or kernel space?
Is it possible to have software reorganize I/Os like hardware?
What happens to the state of the array if the cache is gone after 72 hours?
etc...

Linux mdadm is now quite a lot better, and you can also use BTRFS or ZFS. I've played around with removing drives, rebuilding etc. using mdadm. I no longer bother now as I just use ZFS for all my storage needs.

In short, software RAID is now at a stage where it can be faster than hardware RAID, provides end-to-end checksumming (so silent data corruption gets detected), organizes writes to convert random writes into sequential writes (whilst providing dynamic block allocation) and can be very efficient in terms of its resource usage.
A test that compares software and hardware RAID by Robert - http://milek.blogspot.co.uk/2006/08/hw-raid-vs-zfs-software-raid-part-ii.html
This is also referenced in "Unix and Linux System Administration Handbook, fourth edition".
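
The end-to-end checksumming idea mentioned above can be illustrated with a toy block store in Python. This is a minimal sketch and not how ZFS is implemented: the class and method names are invented for the example, SHA-256 is just a convenient stand-in checksum, and the store only detects corruption, whereas ZFS keeps checksums in parent block pointers and can repair a bad block from redundancy.

```python
import hashlib


class ChecksummedStore:
    """Toy block store that verifies every read against a stored checksum."""

    def __init__(self):
        self._blocks = {}     # block id -> raw bytes (stands in for the disk)
        self._checksums = {}  # block id -> checksum, kept apart from the data

    def write(self, block_id, data):
        self._blocks[block_id] = data
        self._checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        data = self._blocks[block_id]
        if hashlib.sha256(data).hexdigest() != self._checksums[block_id]:
            # A plain RAID card would hand this data back without noticing.
            raise IOError("checksum mismatch on block %r" % (block_id,))
        return data

    def corrupt(self, block_id):
        """Simulate bit rot by flipping the first byte of a non-empty block."""
        damaged = bytearray(self._blocks[block_id])
        damaged[0] ^= 0xFF
        self._blocks[block_id] = bytes(damaged)


if __name__ == "__main__":
    store = ChecksummedStore()
    store.write(1, b"important data")
    print(store.read(1))       # verified read succeeds
    store.corrupt(1)
    try:
        store.read(1)
    except IOError as err:
        print("detected:", err)
```

Keeping the checksum separate from the block it covers is the key design choice: a block that is silently damaged cannot vouch for itself.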

Saturday, 31 October 2015

Microsoft is Evil!

This link is funny

http://toastytech.com/evil/index.html

and on it within the links is my favorite message



from - http://toastytech.com/evil/errwindows.html

you never know, maybe messages like that could exist!