Solaris ZFS Boot on SPARC
© 2008 Dusan Baljevic. The information contained herein is subject to change without notice.

Traditional File Systems and ZFS

ZFS Goals
• Pooled storage design makes for easier administration
• No need for a Volume Manager
• Straightforward commands and a GUI
• Snapshots and clones
• Quotas and reservations
• Compression
• Pool migration
• ACLs for security
• Dynamic striping
• Dynamic resilvering
• Dynamic resizing

ZFS Overheads
• Compression
• Checksum - it has been measured to consume very roughly 1 GHz of CPU to checksum 500 MB/sec of data
• RAID-Z
As of the Solaris 10 11/06 release, checksum, compression, and RAID-Z parity computation all occur in the context of the thread that syncs pool data. Under load, that thread may become a performance bottleneck. It is expected that in future releases all these computations will be done in concurrent threads. Compression is no longer single-threaded (CR 6460622) as of the Solaris 10 8/07 release.

Solaris – ZFS Boot Project
The ZFS Boot project has been divided into three areas: x86 boot, SPARC boot, and install. The main phases of the project are:
• A ZFS plug-in for the GRUB boot loader (available since OpenSolaris snv_62)
• Development of a boot loader on SPARC
• Enhancements necessary for implementing a ZFS root file system
• Enhancements to the Solaris install and Live Upgrade features in order to set up and maintain root file systems in ZFS pools

Solaris ZFS with AVS
• Sun StorageTek Availability Suite (AVS) provides the Remote Mirror Copy and Point-in-Time Copy services, previously known as SNDR (Sun Network Data Replicator) and II (Instant Image). They are similar to the Veritas VVR (volume replicator) and FlashSnap (point-in-time copy) products. AVS is currently available in the Solaris Express release.
• SNDR differs from the ZFS send and recv features, which are time-fixed replication features. For example, you can take a point-in-time snapshot, replicate it, or replicate it based on a differential of a prior snapshot. The combination of the AVS II and SNDR features also allows you to perform time-fixed replication. The other modes of the AVS SNDR replication feature allow you to obtain CDP (continuous data protection). ZFS does not currently have this feature.

Solaris ZFS – Third-Party Backups
• Sun StorEdge Enterprise Backup Software (Legato NetWorker 7.3.2 and above) can fully back up and restore ZFS files, including ACLs
• EMC NetWorker version 7.3.2 and above backs up and restores ZFS file systems, including ZFS ACLs
• Veritas NetBackup version 6.5 and above can back up and restore ZFS file systems, including ZFS ACLs
• IBM Tivoli Storage Manager client software (5.4.1.2) backs up and restores ZFS file systems with both the CLI and the GUI. ZFS ACLs are also preserved
• Computer Associates' BrightStor ARCserve product backs up and restores ZFS file systems, but ZFS ACLs are not preserved

Solaris ZFS and Data Protector
Backup Agents (Disk Agents):
• Data Protector 5.5 (released): ZFS support on Solaris 10, excluding ACL support
• Data Protector 6.0 (released): ZFS support on Solaris 10, including ACL support
Data Protector supports ZFS now.
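The time-fixed replication that ZFS itself offers (as opposed to the continuous replication of AVS SNDR) is driven by zfs send and zfs recv. The sketch below is illustrative only and is not part of the original lab; the remote host name remotehost and the destination pool backuppool are hypothetical.

# zfs snapshot rpool/export/home@monday
# zfs send rpool/export/home@monday | ssh remotehost zfs receive backuppool/home
(later, send only the changes made since the previous snapshot)
# zfs snapshot rpool/export/home@tuesday
# zfs send -i @monday rpool/export/home@tuesday | ssh remotehost zfs receive backuppool/home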
ZFS Boot Lab Server
HOSTNAME               sft3
FQDN                   sft3.mydom.dom
MODEL                  SUNW,Sun-Fire-T2000
UNAME -A               SunOS 5.11 snv_97 sun4v
ARCH                   64-bit sparcv9 kernel
RUN LEVEL              3
PHYSICAL MEMORY        8 GB
CPU                    32 x 1000 MHz virtual CPUs
PAGESIZE               8192 bytes
VOLUME MANAGER COUNT   SINGLE Volume Manager
VOLUME MANAGER         Zettabyte File System (ZFS)

Solaris – ZFS Disks
ZFS applies an EFI label when creating a storage pool with whole disks. Disks can be labelled with a traditional Solaris VTOC label when creating a storage pool with a disk slice. Slices should only be used under the following conditions:
• The device name is non-standard
• A single disk is shared between ZFS and another file system, such as UFS
• A disk is used as a swap or a dump device
• A disk is used for booting the operating system

Solaris – EFI Versus VTOC (SMI) Labels
EFI is a new disk label that was introduced in the Solaris 9 4/03 Operating System. The acronym EFI stands for Extensible Firmware Interface. This label format is REQUIRED for all devices over 1 TB in size, and cannot be converted back to VTOC.
• Supported by the 64-bit Solaris 9 Operating System (4/03) and above
• Provides support for disks greater than 1 TB in size
• Provides 7 usable slices - 0 through 6 - where slice 2 is just another slice
• Solaris ZFS uses EFI labels by default
• Partitions (or slices) cannot overlap with the primary or backup label, or with any other partitions. The size of the EFI label is usually 34 sectors, so partitions start at sector 34. This means no partition can start at sector zero (0)

Solaris – EFI Versus VTOC (SMI) Labels (continued)
• No cylinder or head information is stored in the label. Sizes are reported in sectors and blocks
• Information that was stored in the alternate cylinders area, the last two cylinders of the disk, is now stored in slice 8
• When using the "format" utility to change partition sizes, the unassigned partition tag is assigned to partitions with sizes equal to zero. By default, the "format" utility assigns the "usr" partition tag to any partition with a size greater than zero. You can use the "partition change" menu to reassign partition tags after the partitions are changed. You cannot change a partition with a non-zero size to the "unassigned" partition tag
• EFI labels CAN be written to disks smaller than 1 TB, where a standard VTOC label would normally apply, but VTOC cannot be forced onto devices larger than 1 TB. This can be done by running "format -e". When you attempt to "label" the disk, you will be asked which type of label is to be written

Solaris – EFI Restrictions
• Layered software products not designed for systems with EFI-labelled disks might be incapable of accessing a disk with an EFI disk label (for example, current VxVM versions). A disk with an EFI disk label is not recognized on systems running previous Solaris releases. Adding an EFI-labelled disk to a system that does not support it:
Dec 3 09:12:17 <hostname> scsi: WARNING: /sbus@a,0/SUNW,socal@d,10000/sf@1,0/ssd@w50020f23000002a4,0 (ssd1): corrupt label - wrong magic number
• Cannot boot from a disk with an EFI disk label
• Cannot use the Solaris Management Console Disk Manager Tool to manage disks with EFI labels. Use the "format" utility or the Solaris Management Console Enhanced Storage Tool to manage disks with EFI labels, after you use the "format" utility to partition the disk
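As a hedged illustration of the labelling behaviour described above (not part of the original lab; the device and pool names are hypothetical), ZFS writes an EFI label when it is given a whole disk, but leaves an existing VTOC (SMI) label in place when it is given a slice:

# zpool create datapool c1t2d0
(whole disk: ZFS relabels the disk with an EFI label and places the data in slice 0)
# zpool create bootpool c1t3d0s0
(disk slice: the existing VTOC label and slice table are kept, as required for boot disks)
# prtvtoc /dev/rdsk/c1t2d0s0
(shows the EFI partition layout that ZFS applied to the whole disk)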
Use the "format" utility or the Solaris Management Console Enhanced Storage Tool to manage disks with EFI labels, after you use the 'format' utility to partition the disk March 19, 2016 Dusan Baljevic 13 Solaris – EFI Restrictions (continued) • The EFI specification prohibits overlapping slices. The whole disk is represented by "c#t#d#p6": lrwxrwxrwx 1 root root 70 Nov 25 18:54 c8t1d0p6 -> ../../devices/.../ssd@w50020f23000004aa,0:wd • The following format options are not applicable on disks with EFI labels: o The "save" option is not supported because disks with EFI labels do not need an entry in the "format.dat" file o The "backup" option is not applicable because the disk driver finds the primary label and writes it back to the disk • In the context of SunCluster 3.0/3.1 (refer to Sun Solution 213130): # scgdevs Could not stat: /dev/rdsk/../../devices/pci@1c,600000/scsi@2,1/sd@8,0:h,raw path not loaded. # prtvtoc /dev/did/rdsk/d4s2 prtvtoc: /dev/did/rdsk/d4s2: No such device or address March 19, 2016 Dusan Baljevic 14 EFI Versus VTOC Disk Label * # format -e c0t8d0 selecting c0t8d0 [disk formatted] FORMAT MENU: disk - select a disk type - select (define) a disk type partition - select (define) a partition table current - describe the current disk format - format and analyze the disk repair - repair a defective sector label - write label to the disk analyze - surface analysis defect - defect list management backup - search for backup labels verify - read and display labels save - save new disk/partition definitions inquiry - show vendor, product and revision scsi - independent SCSI mode selects cache - enable, disable or query SCSI disk cache volname - set 8-character volume name !<cmd> - execute <cmd>, then return quit format> l [0] SMI Label [1] EFI Label Specify Label type[1]: 0 Warning: This disk has an EFI label. Changing to SMI label will erase all current partitions. Continue? y Auto configuration via format.dat[no]? y March 19, 2016 Dusan Baljevic 15 Solaris – ZFS Boot Disk Slices Two internal disks were available on SunFire T-2000: c0t0d0 c0t1d0 The following slices were created on c0t0d0: Current partition table (original): Total disk cylinders available: 14087 + 2 (reserved cylinders) Part Tag Flag 0 root wm 1 unassigned wm 2 backup wm 3 unassigned wm 4 unassigned wm 5 unassigned wm 6 unassigned wm 7 unassigned wm March 19, 2016 Cylinders 0 - 14086 0 0 - 14086 0 0 0 0 0 Size Blocks 68.35GB (14087/0/0) 143349312 0 (0/0/0) 0 68.35GB (14087/0/0) 143349312 0 (0/0/0) 0 0 (0/0/0) 0 0 (0/0/0) 0 0 (0/0/0) 0 0 (0/0/0) 0 Dusan Baljevic 16 Solaris – ZFS Recommendations • Run ZFS on a system that runs a 64-bit kernel • 1 GB or more of physical memory is recommended Approximately 64 KB of memory is consumed per mounted ZFS file system. On systems with many ZFS file systems, it is recommended to have 1 GB of extra memory for every 10,000 mounted file systems including snapshots. Be prepared for longer boot times on these systems as well. • The minimum amount of available pool space that is required for a bootable ZFS root file system is larger than for a bootable UFS root file system because swap and dump devices are not shared in a ZFS root environment. In addition, the swap and dump devices are sized at 1/2 the size of RAM, but no more than 2 GB and no less than 512 MB (starting at build 96) • The minimum amount of available pool space for a bootable ZFS root file system depends upon the amount of physical memory, the disk space available, and the number of BEs to be created. 
Solaris – ZFS Recommendations (continued)
• Because ZFS caches data in kernel addressable memory, the kernel size will possibly be larger than with other file systems. Configure additional disk-based swap to account for this difference on systems with limited RAM. Use the size of physical memory as an upper bound on the extra amount of swap space that might be required
• If possible, do not use slices on the same disk for both swap space and ZFS file systems. Keep the swap areas separate from the ZFS file systems. The best policy is to have enough RAM so that your system does not normally use the swap devices

Solaris – ZFS Root File System Recommendations
Keep the root pool (the pool with the dataset that is allocated for the root file system) separate from pools that are used for data:
• Some limitations exist on root pools that you would not want to place on data pools. Mirrored pools and pools with one disk are supported; RAID-Z or unreplicated pools with more than one disk are not supported
• Data pools can be architecture-neutral. It might make sense to move a data pool between SPARC and Intel systems. Root pools are pretty much tied to a particular architecture
• As a good practice, it is recommended to separate the "personality" of a system from its data

Solaris ZFS Boot Installation
• Solaris Express Community Edition snv_97 SPARC (Solaris 10 Update 6, November 2008, is the first official release with ZFS boot support)
• Boot off the DVD

Choose Filesystem Type ────────────────────────────────────
Select the filesystem to use for your Solaris installation
[ ] UFS
[X] ZFS

Solaris ZFS Boot Installation (continued)
Select Software ───────────────────────────────────────────
Select the Solaris software to install on the system.
NOTE: After selecting a software group, you can add or remove software by customizing it. However, this requires understanding of software dependencies and how Solaris software is packaged.
[X] Entire Distribution plus OEM support ....... 8527.00 MB
[ ] Entire Distribution ........................ 8495.00 MB
[ ] Developer System Support ................... 8190.00 MB
[ ] End User System Support .................... 6178.00 MB
[ ] Core System Support ........................  907.00 MB
[ ] Reduced Networking Core System Support .....  849.00 MB

Solaris ZFS Boot Installation (continued)
Select Disks ───────────────────────────────────────────
On this screen you must select the disks for installing Solaris software. Start by looking at the Suggested Minimum field; this value is the approximate space needed to install the software you have selected. For ZFS, multiple disks will be configured as mirrors, so the disk you choose, or the slice within the disk, must exceed the Suggested Minimum value.
NOTE: ** denotes current boot disk
Disk Device                    Available Space
===============================================
[X] c0t0d0                     69994 MB  (F4 to edit)
[ ] c0t1d0                     69994 MB
[ ] c2t50001FE150062328d3      24560 MB
[ ] c2t50001FE15006232Cd3      24560 MB
Maximum Root Size: 69994 MB
Suggested Minimum:  8527 MB

Solaris ZFS Boot Installation (continued)
Specify the name of the pool to be created from the disk(s) you have chosen. Also specify the name of the dataset to be created within the pool that is to be used as the root directory for the filesystem.
ZFS Pool Name: rpool
ZFS Root Dataset Name: snv_97
ZFS Pool Size (in MB): 69995
Size of Swap Area (in MB): 2048
Size of Dump Area (in MB): 2048
(Pool size must be between 8527 MB and 69995 MB)
[ ] Keep / and /var combined
[X] Put /var on a separate dataset
Solaris ZFS Boot Installation – First Reboot
Rebooting with command: boot
Boot device: disk  File and args:
zfs-file-system
Loading: /platform/SUNW,Sun-Fire-T200/boot_archive
Loading: /platform/sun4v/boot_archive
ramdisk-root ufs-file-system
Loading: /platform/SUNW,Sun-Fire-T200/kernel/sparcv9/unix
Loading: /platform/sun4v/kernel/sparcv9/unix
SunOS Release 5.11 Version snv_97 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Hostname: sft3
Configuring devices.
Loading smf(5) service descriptions: 202/202
Creating new rsa public/private host key pair
Creating new dsa public/private host key pair
Sep 6 00:23:46 sft3 sendmail[8326]: My unqualified host name (sft3) unknown; sleeping for retry

Solaris ZFS Boot Installation – Boot from OpenBoot Prompt (OBP)
Rebooting with command: boot -L
Boot device: /pci@780/pci@0/pci@9/scsi@0/disk@0  File and args: -L
zfs-file-system
Loading: /platform/sun4v/bootlst
1 snv_97
Select environment to boot: [ 1 - 1 ]

Solaris ZFS Boot Installation – First Login
# df -F zfs -h
Filesystem              size  used  avail  capacity  Mounted on
rpool/ROOT/snv_97        67G  6.2G    49G       12%  /
rpool/ROOT/snv_97/var    67G  144M    49G        1%  /var
rpool/export             67G   21K    49G        1%  /export
rpool/export/home        67G  406K    49G        1%  /export/home
rpool                    67G   63K    49G        1%  /rpool

# uname -a
SunOS sft3 5.11 snv_97 sun4v sparc SUNW,Sun-Fire-T200

Solaris ZFS – Legacy vs ZFS (V)fstab
# cat /etc/vfstab
fd                         -  /dev/fd            fd       -  no   -
/proc                      -  /proc              proc     -  no   -
/dev/zvol/dsk/rpool/swap   -  -                  swap     -  no   -
/devices                   -  /devices           devfs    -  no   -
sharefs                    -  /etc/dfs/sharetab  sharefs  -  no   -
ctfs                       -  /system/contract   ctfs     -  no   -
objfs                      -  /system/object     objfs    -  no   -
swap                       -  /tmp               tmpfs    -  yes  -

Solaris ZFS Boot – Dump and Swap Separation
# dumpadm
      Dump content: kernel pages
       Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/sft2k1
  Savecore enabled: yes

# swap -l
swapfile                   dev    swaplo  blocks   free
/dev/zvol/dsk/rpool/swap2  253,2  16      8388592  8388592

Solaris ZFS Pools and Datasets
# zfs list
NAME                   USED   AVAIL  REFER  MOUNTPOINT
rpool                  18.3G  48.6G    63K  /rpool
rpool/ROOT             6.29G  48.6G    18K  legacy
rpool/ROOT/snv_97      6.29G  48.6G  6.15G  /
rpool/ROOT/snv_97/var   144M  48.6G   144M  /var
rpool/dump             8.00G  48.6G  8.00G  -
rpool/export            484K  48.6G    21K  /export
rpool/export/home       446K  48.6G    20K  /export/home
rpool/swap                8G  48.6G    16K  -

Solaris ZFS Boot Installation – Change Size of Root File System
# zfs set quota=4000m rpool/ROOT/snv_97
cannot set property for 'rpool/ROOT/snv_97': size is less than current used or reserved space
# zfs set quota=8000m rpool/ROOT/snv_97
# df -k /
Filesystem          kbytes    used     avail    capacity  Mounted on
rpool/ROOT/snv_97   8192000   6656377  1388629  83%       /
# zfs set quota=none rpool/ROOT/snv_97
# df -k /
Filesystem          kbytes     used     avail     capacity  Mounted on
rpool/ROOT/snv_97   70189056   6656377  46604048  13%       /
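Because the dump area above is a ZFS volume, its size is controlled by the volsize property rather than by a quota. A minimal, hedged sketch of shrinking it to 4 GB (standard procedure, not captured from the lab system):

# zfs get volsize rpool/dump
# zfs set volsize=4G rpool/dump
# dumpadm -d /dev/zvol/dsk/rpool/dump
(re-registers the resized volume as the dedicated dump device)
# dumpadm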
Solaris ZFS Boot Disk Mirroring (part 1)
# prtvtoc /dev/rdsk/c0t0d0s2 | fmthard -s - /dev/rdsk/c0t1d0s2
# zpool attach -f rpool c0t0d0s0 c0t1d0s0
(root pool cannot be raidz yet)
# zpool status rpool
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h6m, 71.40% done, 0h2m to go
config:
        NAME          STATE   READ  WRITE  CKSUM
        rpool         ONLINE     0      0      0
          mirror      ONLINE     0      0      0
            c0t0d0s0  ONLINE     0      0      0
            c0t1d0s0  ONLINE     0      0      0
errors: No known data errors

Solaris ZFS Boot Disk Mirroring (part 2)
SPARC
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk \
  /dev/rdsk/c0t1d0s0
X86
# installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

# eeprom "use-nvramrc?=true"
# prtconf -pv | grep bootpath
bootpath: '/pci@780/pci@0/pci@9/scsi@0/disk@0,0:a'
# eeprom "nvramrc=devalias zfsroot /pci@780/pci@0/pci@9/scsi@0/disk@0,0:a \
  devalias zfsmirror /pci@780/pci@0/pci@9/scsi@0/disk@1,0:a"
(format(1) lists this device as "sd")

Solaris ZFS Boot Disk Mirroring (part 3)
# eeprom boot-device="zfsroot zfsmirror disk net"
# eeprom
ttya-rts-dtr-off=false
ttya-ignore-cd=true
local-mac-address?=true
…
boot-device=zfsroot zfsmirror disk net
use-nvramrc?=true
nvramrc=devalias zfsroot /pci@780/pci@0/pci@9/scsi@0/disk@0,0:a devalias zfsmirror /pci@780/pci@0/pci@9/scsi@0/disk@1,0:a
security-mode=none
…
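With the boot block installed and the devaliases defined above, either half of the mirror should be bootable from the OpenBoot prompt. The session below is illustrative rather than captured output:

ok boot zfsmirror -L
(lists the boot environments present on the second disk)
ok boot zfsmirror
(boots the system from the mirrored disk)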
Solaris ZFS Boot Installation – Change Primary Swap Device and Its Size (part 1)
# zfs set quota=4192m rpool/swap
cannot set property for 'rpool/swap': 'quota' does not apply to datasets of this type.
Quotas cannot be set on volumes, as the "volsize" property acts as an implicit quota.
# swap -l
swapfile                  dev    swaplo  blocks    free
/dev/zvol/dsk/rpool/swap  253,2  16      16777200  16777200

Solaris ZFS Boot Installation – Change Primary Swap Device and Its Size (part 2)
# zfs create -V 4096m rpool/swap2
# zfs list
NAME                   USED   AVAIL  REFER  MOUNTPOINT
rpool                  26.5G  40.4G    63K  /rpool
rpool/ROOT             6.49G  40.4G    18K  legacy
rpool/ROOT/snv_97      6.49G  40.4G  6.35G  /
rpool/ROOT/snv_97/var   144M  40.4G   144M  /var
rpool/dump             8.00G  40.4G  8.00G  -
rpool/export            466K  40.4G    21K  /export
rpool/export/home       446K  40.4G   406K  /export/home
rpool/swap                8G  48.4G  2.05M  -
rpool/swap2               4G  44.4G    16K  -

Solaris ZFS Boot Installation – Change Primary Swap Device and Its Size (part 3)
# swap -a /dev/zvol/dsk/rpool/swap2
# swap -l
swapfile                   dev    swaplo  blocks    free
/dev/zvol/dsk/rpool/swap   253,2  16      16777200  16777200
/dev/zvol/dsk/rpool/swap2  253,3  16      8388592   8388592

# swap -d /dev/zvol/dsk/rpool/swap
# swap -l
swapfile                   dev    swaplo  blocks   free
/dev/zvol/dsk/rpool/swap2  253,3  16      8388592  8388592

Solaris ZFS Boot Installation – Change Primary Swap Device and Its Size (part 4)
# zfs destroy rpool/swap
# zfs list
NAME                   USED   AVAIL  REFER  MOUNTPOINT
rpool                  18.5G  48.4G    63K  /rpool
rpool/ROOT             6.49G  48.4G    18K  legacy
rpool/ROOT/snv_97      6.49G  48.4G  6.35G  /
rpool/ROOT/snv_97/var   144M  48.4G   144M  /var
rpool/dump             8.00G  48.4G  8.00G  -
rpool/export            466K  48.4G    21K  /export
rpool/export/home       446K  48.4G   406K  /export/home
rpool/swap2               4G  52.4G    16K  -

Solaris ZFS Boot Installation – Change Primary Swap Device and Its Size (part 5)
# df -k /tmp
Filesystem  kbytes   used  avail    capacity  Mounted on
swap        8479424  40    8479384  1%        /tmp

# grep swap /etc/vfstab
/dev/zvol/dsk/rpool/swap  -  -     swap   -  no   -
swap                      -  /tmp  tmpfs  -  yes  -

Solaris ZFS – Root Pool Detach Mirror
# zpool offline rpool c0t1d0s0
# zpool detach rpool c0t1d0s0
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME        STATE   READ  WRITE  CKSUM
        rpool       ONLINE     0      0      0
          c0t1d0s0  ONLINE     0      0      0
errors: No known data errors

Two potential problems:
• When booting the mirror disk, if the primary disk is online, it will be resilvered with the old data
• There is no easy way to access the mirror disk data without rebooting

Solaris ZFS – Example EFI and VTOC (part 1)
# format
0. c7t600508B400102E8E0001500013FB0000d0 <HP-HSV210-6000 cyl 4606 alt 2 hd 128 sec 128>
   /scsi_vhci/ssd@g600508b400102e8e0001500013fb0000
1. c7t600508B400102E8E0001500014000000d0 <HP-HSV210-6000 cyl 4606 alt 2 hd 128 sec 128>
   /scsi_vhci/ssd@g600508b400102e8e0001500014000000
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[0]: 1
Warning: This disk has an SMI label. Changing to EFI label will erase all current partitions.
Continue? Y

Solaris ZFS – Example EFI and VTOC (part 2)
EFI Label
Part  Tag         Flag  First Sector  Size     Last Sector
0     usr         wm    34            35.99GB  75481054
1     unassigned  wm    0             0        0
2     unassigned  wm    0             0        0
3     unassigned  wm    0             0        0
4     unassigned  wm    0             0        0
5     unassigned  wm    0             0        0
6     unassigned  wm    0             0        0
7     unassigned  wm    0             0        0
8     reserved    wm    75481055      8.00MB   75497438

Solaris ZFS – Example EFI and VTOC (part 3)
VTOC Label
Part  Tag         Flag  Cylinders  Size      Blocks
0     root        wm    0 - 15     128.00MB  (16/0/0)     262144
1     swap        wu    16 - 31    128.00MB  (16/0/0)     262144
2     backup      wu    0 - 4605   35.98GB   (4606/0/0) 75464704
3     unassigned  wm    0          0         (0/0/0)           0
4     unassigned  wm    0          0         (0/0/0)           0
5     unassigned  wm    0          0         (0/0/0)           0
6     usr         wm    32 - 4605  35.73GB   (4574/0/0) 74940416
7     unassigned  wm    0          0         (0/0/0)           0

Solaris ZFS – Example EFI and VTOC (part 4)
# zpool create hppool raidz1 \
  /dev/dsk/c7t600508B400102E8E0001500014000000d0s0 \
  /dev/dsk/c7t600508B400102E8E0001500013FB0000d0s0 \
  /dev/dsk/c7t600508B400102E8E0001500013FB0000d0s6
invalid vdev specification
use '-f' to override the following errors:
raidz contains devices of different sizes
# zpool create -f hppool raidz1 \
  /dev/dsk/c7t600508B400102E8E0001500014000000d0s0 \
  /dev/dsk/c7t600508B400102E8E0001500013FB0000d0s0 \
  /dev/dsk/c7t600508B400102E8E0001500013FB0000d0s6

Solaris ZFS – Example EFI and VTOC (part 5)
# zpool list
NAME    SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
hppool  370M   158K   370M   0%   ONLINE
rpool   33.8G  5.35G  28.4G  15%  ONLINE

# zpool status hppool
  pool: hppool
 state: ONLINE
 scrub: none requested
        NAME                                         STATE   READ  WRITE  CKSUM
        hppool                                       ONLINE     0      0      0
          raidz1                                     ONLINE     0      0      0
            c7t600508B400102E8E0001500014000000d0s0  ONLINE     0      0      0
            c7t600508B400102E8E0001500013FB0000d0s0  ONLINE     0      0      0
            c7t600508B400102E8E0001500013FB0000d0s6  ONLINE     0      0      0
errors: No known data errors

Solaris ZFS – Snapshot and Rollback
# zfs snapshot rpool/export/home@Dusan
# zfs list
NAME                     USED   AVAIL  REFER  MOUNTPOINT
rpool                    22.5G  44.4G    63K  /rpool
rpool/ROOT               6.49G  44.4G    18K  legacy
rpool/ROOT/snv_97        6.49G  44.4G  6.35G  /
rpool/ROOT/snv_97/var     144M  44.4G   144M  /var
rpool/dump               8.00G  44.4G  8.00G  -
rpool/export              428K  44.4G    20K  /export
rpool/export/home         408K  44.4G   408K  /export/home
rpool/export/home@Dusan      0      -   408K  -
rpool/swap                  8G  52.4G  2.05M  -

# zfs list -o snapdir rpool/export/home@Dusan
# rm /export/home/dusan/somefile
# zfs rollback rpool/export/home@Dusan
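Rolling back discards everything written to the file system since the snapshot. A gentler, hedged alternative (standard ZFS behaviour, not shown in the lab) is to copy individual files back out of the snapshot's hidden .zfs directory; the path below assumes the somefile example used above:

# ls /export/home/.zfs/snapshot/Dusan/dusan/
# cp -p /export/home/.zfs/snapshot/Dusan/dusan/somefile /export/home/dusan/
(zfs set snapdir=visible rpool/export/home makes the .zfs directory show up in directory listings)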
Solaris ZFS – Delegated Administration
# zpool get delegation rpool
NAME   PROPERTY    VALUE  SOURCE
rpool  delegation  on     default
# zfs allow dusan create,destroy,mount,snapshot rpool/export/home
# zfs allow rpool/export/home
-------------------------------------------------------------
Local+Descendent permissions on (rpool/export/home)
        user dusan create,destroy,mount,snapshot
-------------------------------------------------------------

Solaris ZFS – Snapshot and Clone
# zfs snapshot rpool/export/home@Dusan2
# zfs clone rpool/export/home@Dusan2 rpool/export/clone
# zfs list
NAME                      USED   AVAIL  REFER  MOUNTPOINT
rpool                     22.5G  44.4G    63K  /rpool
rpool/ROOT                6.49G  44.4G    18K  legacy
rpool/ROOT/snv_97         6.49G  44.4G  6.35G  /
rpool/ROOT/snv_97/var      144M  44.4G   144M  /var
rpool/dump                8.00G  44.4G  8.00G  -
rpool/export               466K  44.4G    21K  /export
rpool/export/clone            0  44.4G   412K  /export/clone
rpool/export/home          446K  44.4G   406K  /export/home
rpool/export/home@Dusan     17K      -   406K  -
rpool/export/home@Dusan2  22.5K      -   412K  -
rpool/swap                   8G  52.4G  2.05M  -
# rm /export/home/dusan/somefile2
# ls /export/clone/dusan/somefile2

Solaris ZFS Pool Command History
# zpool history
History for 'rpool':
2008-09-05.22:55:48 zpool create -f -o failmode=continue -R /a -m legacy -o cachefile=/tmp/root/etc/zfs/zpool.cache rpool c0t0d0s0
2008-09-05.22:55:48 zfs set canmount=noauto rpool
2008-09-05.22:55:49 zfs set mountpoint=/rpool rpool
2008-09-05.22:55:49 zfs create -o mountpoint=legacy rpool/ROOT
2008-09-05.22:55:50 zfs create -b 8192 -V 8192m rpool/swap
2008-09-05.22:55:51 zfs create -b 131072 -V 8192m rpool/dump
2008-09-05.22:59:19 zfs create -o canmount=noauto rpool/ROOT/snv_97
2008-09-05.22:59:19 zfs create -o canmount=noauto rpool/ROOT/snv_97/var
2008-09-05.22:59:20 zpool set bootfs=rpool/ROOT/snv_97 rpool
2008-09-05.22:59:20 zfs set mountpoint=/ rpool/ROOT/snv_97
2008-09-05.22:59:21 zfs set canmount=on rpool
2008-09-05.22:59:22 zfs create -o mountpoint=/export rpool/export
2008-09-05.22:59:23 zfs create rpool/export/home
2008-09-06.09:22:47 zfs set quota=8000m rpool/ROOT/snv_97

What to Do if ZFS Panics
ZFS is designed to survive arbitrary hardware failures through the use of redundancy (mirroring or RAID-Z). However, certain failures in non-replicated configurations can cause ZFS to panic when trying to load the pool. This is a bug, and will be fixed in future releases. If you cannot boot due to a corrupt pool, do the following:
boot using '-m milestone=none'
# mount -o remount /
# rm /etc/zfs/zpool.cache
# reboot
This will remove all knowledge of pools from your system. You will have to re-create your pool and restore from backup.
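If a clone such as rpool/export/clone above is ever intended to replace its origin file system permanently, the usual sequence (hedged, not part of the original lab) is to promote it first, which transfers the snapshots and reverses the clone/origin dependency:

# zfs promote rpool/export/clone
# zfs destroy rpool/export/home
(now possible, because the clone no longer depends on the original file system)
# zfs rename rpool/export/clone rpool/export/home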
March 19, 2016 Dusan Baljevic 49 Solaris ZFS – Web Management • Web ZFS management (as of Solaris Express Community Release build 28): # /usr/sbin/smcwebserver enable # /usr/sbin/smcwebserver start https://servername:6789/zfs March 19, 2016 Dusan Baljevic 50 Solaris ZFS – Redundant Copies for Data* # zfs create -o copies=1 mypool3/single # zfs create -o copies=2 mypool3/double # zfs create -o copies=3 mypool3/triple # cp -rp /etc /mypool3/single # cp -rp /etc /mypool3/double # cp -rp /etc /mypool3/triple # zfs list -r mypool3 NAME mypool3 mypool3/double mypool3/single mypool3/triple March 19, 2016 USED 48.2M 16.0M 8.09M 23.8M AVAIL 310M 310M 310M 310M REFER 33.5K 16.0M 8.09M 23.8M Dusan Baljevic MOUNTPOINT /mypool3 /mypool3/double /mypool3/single /mypool3/triple 51 Solaris ZFS – Redundant Copies for Data (continued) * March 19, 2016 Dusan Baljevic 52 Solaris ZFS – Redundant Copies for Data (continued) * March 19, 2016 Dusan Baljevic 53 Solaris ZFS and JumpStart (part 1) • For a JumpStart installation, you cannot use an existing ZFS storage pool to create a bootable ZFS root pool. Example: install_type initial_install cluster SUNWCall pool rpool 20G 4g 4g any bootenv installbe bename myBE • You must create your pool with disk slices rather than whole disks. If in the profile you create a pool with whole disks, such as c0t0d0, the installation fails “Invalid disk name (c0t0d0)” March 19, 2016 Dusan Baljevic 54 Solaris ZFS and JumpStart (part2) • Some keywords that are allowed in a UFS specific profile are not allowed in a ZFS specific profile. Examples of those not valid for ZFS root pool: client_arch, client_root, backup_media, archive_location, client_swap, layout_constraint, partitioning, system_type, and so on. More details are in Solaris Installation Guide • You cannot upgrade with JumpStart - must use Solaris Live Upgrade March 19, 2016 Dusan Baljevic 55 Solaris ZFS and JumpStart Examples (part 1) • Mirrored ZFS Root Pool install_type initial_install cluster SUNWCall pool myrootpool auto auto auto mirror c0t0d0s0 c0t1d0s0 bootenv installbe bename solaris10_6 • Customizing the Disk Size For a ZFS Root Pool install_type initial_install cluster SUNWCall pool zfspool 80g 2g 2g mirror any any bootenv installbe bename solaris10_6 March 19, 2016 Dusan Baljevic 56 Solaris ZFS and JumpStart Examples (part 2) • Specifying where to Install the O/S install_type initial_install cluster SUNWCall root_device c0t0d0s0 pool hprootpool auto auto auto rootdisk.s0 bootenv installbe bename mybootname dataset /var March 19, 2016 Dusan Baljevic 57 Solaris ZFS and Live Upgrade (part 1) # lustatus ERROR: No boot environments are configured on this system ERROR: cannot determine list of all boot environment names # lucreate -c BE1 -n BE2 Current boot environment is named <BE1>. Creating initial configuration for primary boot environment <BE1>. The device </dev/dsk/c7t2000002037E35629d0s0> is not a root device for any boot environment; cannot get BE ID. PBE configuration successful: PBE name <BE1> PBE Boot Device </dev/dsk/c7t200000 2037E35629d0s0>. Source boot environment is <BE1>. Creating boot environment <BE2>. Cloning file systems from boot environment <BE1> to create boot environment <BE2>. Creating snapshot for <rpool/ROOT/s10s_u6wos_07b> on <rpool/ROOT/s10s_u6wos_07b@BE2>. Creating clone for <rpool/ROOT/s10s_u6wos_07b@BE2> on <rpool/ROOT/BE2>. Setting canmount=noauto for </> in zone <global> on <rpool/ROOT/BE2>. Creating snapshot for <rpool/ROOT/s10s_u6wos_07b/var> on <rpool/ROOT/s10s_u6wos_07b/var@BE2>. 
Creating clone for <rpool/ROOT/s10s_u6wos_07b/var@BE2> on <rpool/ROOT/BE2/var>.
Setting canmount=noauto for </var> in zone <global> on <rpool/ROOT/BE2/var>.
Population of boot environment <BE2> successful.
Creation of boot environment <BE2> successful.

Solaris ZFS and Live Upgrade (part 2)
# lustatus
Boot Environment           Is        Active  Active     Can     Copy
Name                       Complete  Now     On Reboot  Delete  Status
-------------------------  --------  ------  ---------  ------  ------
BE1                        yes       yes     yes        no      -
BE2                        yes       no      no         yes     -

Solaris ZFS and Live Upgrade (part 3)
# lufslist BE1
boot environment name: BE1
This boot environment is currently active.
This boot environment will be active on next system boot.

Filesystem                     fstype  device size  Mounted on    Mount Options
-----------------------------  ------  -----------  ------------  -------------
/dev/zvol/dsk/rpool/swap       swap    1073741824   -
rpool/ROOT/s10s_u6wos_07b      zfs     4671150080   /
rpool/ROOT/s10s_u6wos_07b/var  zfs     73081856     /var
rpool                          zfs     6820660736   /rpool
rpool/export                   zfs     38912        /export
hppool                         zfs     124806       /hppool
rpool/export/home              zfs     18432        /export/home

# lufslist BE2
boot environment name: BE2

Filesystem                fstype  device size  Mounted on    Mount Options
------------------------  ------  -----------  ------------  -------------
/dev/zvol/dsk/rpool/swap  swap    1073741824   -
rpool/ROOT/BE2            zfs     104448       /
rpool/export              zfs     38912        /export
rpool/export/home         zfs     18432        /export/home
hppool                    zfs     124806       /hppool
rpool                     zfs     6820660736   /rpool
rpool/ROOT/BE2/var        zfs     ?            /var

Solaris ZFS and Live Upgrade (part 4)
# lumount BE2 /BE2
# df -F zfs -h
Filesystem                     size  used  avail  capacity  Mounted on
rpool/ROOT/s10s_u6wos_07b       33G  4.3G    27G       14%  /
rpool/ROOT/s10s_u6wos_07b/var   33G   70M    27G        1%  /var
rpool/export                    33G   20K    27G        1%  /export
rpool/export/home               33G   18K    27G        1%  /export/home
rpool                           33G   94K    27G        1%  /rpool
hppool                         214M   24K   214M        1%  /hppool
rpool/ROOT/BE2                  33G  4.3G    27G       14%  /BE2
rpool/ROOT/BE2/var              33G   70M    27G        1%  /BE2/var

Solaris ZFS and Encryption
The current integration target is build 105 of OpenSolaris (2009.04). Even if Sun management and marketing wanted ZFS Crypto in S10u7, it is not technically possible. Backporting ZFS crypto to a Solaris 10 update would also require backporting a large number of bug fixes and projects that were integrated into the Crypto Framework and have not yet been backported to Solaris 10.
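To complete a Live Upgrade cycle like the one shown above, the alternate boot environment is typically unmounted, upgraded, activated, and then booted. A hedged outline follows; the install-image path is hypothetical:

# luumount BE2
# luupgrade -u -n BE2 -s /net/installserver/export/solaris_10u6
# luactivate BE2
# init 6
(use init or shutdown rather than reboot so that the boot-environment switch completes properly)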
March 19, 2016 Dusan Baljevic 62 Appendix ZFS Pool Properties # zfs get all rpool NAME rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool PROPERTY type creation used available referenced compressratio mounted quota reservation recordsize mountpoint sharenfs checksum March 19, 2016 VALUE filesystem Fri Sep 5 22:55 2008 18.3G 48.6G 63K 1.00x yes none none 128K /rpool off on Dusan Baljevic SOURCE default default default local default default 63 Appendix ZFS Pool Properties (continued) NAME rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool PROPERTY compression atime devices exec setuid readonly zoned snapdir aclmode aclinherit canmount shareiscsi xattr copies March 19, 2016 VALUE off on on on on off off hidden groupmask restricted on off on 1 Dusan Baljevic SOURCE default default default default default default default default default default default default default default 64 Appendix ZFS Pool Properties (continued) NAME rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool rpool PROPERTY copies version utf8only normalization casesensitivity vscan nbmand sharesmb refquota refreservation primarycache secondarycache March 19, 2016 VALUE 1 3 off none sensitive off off off none none all all Dusan Baljevic SOURCE default default default default default default default default 65 Appendix Solaris10 Update 6 * Installation Enhancements o Solaris Installation for ZFS Root Pools * System Administration Enhancements o ZFS Command Improvements and Changes + ZFS installation and boot support + Rolling back a ZFS dataset without unmounting + Enhancements to the zfs send command + ZFS quotas and reservations for file system data only + new ZFS storage pool properties + ZFS command history enhancements + support for upgrading ZFS filesystems + ZFS delegated administration + Setting up separate ZFS logging devices + Creating intermediate ZFS datasets + ZFS hot-plugging enhancements + GZIP compression now available for ZFS + Storing multiple copies of ZFS user data March 19, 2016 Dusan Baljevic 66 Appendix Solaris10 Update 6 (continued) o Solaris Installation Tool Support of ZFS File Systems + Solaris interactive text installer to install a UFS or a ZFS root file system. + Custom JumpStart features to set up a profile to create a ZFS storage pool and designate a bootable ZFS file system. + Migrate a UFS root file system to a ZFS root file system by using the Solaris Live Upgrade feature. + Set up a mirrored ZFS root pool by selecting two disks during the installation. + Automatically create swap and dump devices on ZFS volumes in the ZFS root pool. o SunVTS 7.0 Patch Set 3 o lockstat Provider for DTrace. DTrace lockstat probes that displayed the spin count (spins) now returns spin time in nanoseconds. * System Resource Enhancements o New Solaris Zones Features + Update on Attach. If the new host has the same or later versions of the zone-dependent packages and their associated patches, using zoneadm attach with the -u option, updates those packages within the zone to match the new host.[...] This option also enables automatic migration between machine classes, such as from sun4u to sun4v. 
+ Ability to Set Default Router in Shared-IP Zone + ZFS Zone Path Permitted o x86: New GRUB findroot Command o x64: Support for 256 Processors March 19, 2016 Dusan Baljevic 67 Appendix Solaris10 Update 6 (continued) * System Performance Enhancements o SPARC: Solaris SPARC Boot Architecture Redesigned o x86: Kernel Support for Intel SSSE3, SSE4.1, SSE4.2, and AMD SSE4A * Security Enhancements o Separation of Duty Enforcement Through the Solaris Management Console o SHA256/SHA512 crypt(3C) Plug-in o pam_list Module * Desktop Enhancements o SPARC: Adobe Reader 8.1.2 o Flash Player 9.0.124.0 * Networking Enhancements o Communication Protocol Parser Utilities o SIP End-to-end Traffic Measurements and Logging * Device Management Enhancements o Faulty Device Retirement Feature o MPxIO Support for Hitachi Adaptable Modular Storage Series Arrays March 19, 2016 Dusan Baljevic 68 Appendix Solaris10 Update 6 (continued) * Driver Enhancements o x86: NVIDIA ck804/mcp55 SATA Controller Driver o x86: LSI MegaRAID SAS Controllers Driver o ixgbe Driver. The ixgbe is a 10 Gigabit PCI Express Ethernet driver that supports Intel 82598 10 Gigabit Ethernet controller. o SPARC: Support for aac Driver * Additional Software Enhancements o Perl Database Interface and Perl PostgreSQL Driver o PostgreSQL 8.3 * Language Support Enhancements o IIIMF Hangul Language Engine. The Hangul LE (Language Engine) is a new Korean input method. * Freeware Enhancements o C-URL - The C-URL Wrappers Library o Libidn - Internationalized Domain Library o LibGD - The Graphics Draw Library o TIDY HTML Library March 19, 2016 Dusan Baljevic 69 Appendix Sun Versus NetApp * (part 1) http://www.groklaw.net/article.php?story=20081007160707649&query=NetApp-Sun • • After NetApp filed its lawsuit to halt adoption of Sun's open source ZFS technology, Sun responded by filing re-examination requests with the PTO citing the extensive amount of highly relevant prior art that was not disclosed or considered when NetApp originally filed its patents. The patent office clearly agreed with the relevance of this prior art, as demonstrated by its rejection of the claims across all of the re-examinations. Of these patents, three have been described by NetApp as "core" (US Patent Nos. 6,857,001; 6,892,211; and 5,819,292). Here's the current status of each of them: NetApp Patent No. 6,857,001 - The PTO rejected all 63 claims of the patent based on 10 prior art references provided by Sun. In addition, the trial court has agreed to remove that patent from the litigation for now pending the final reexamination by the PTO. NetApp Patent No. 6,892,211 - The PTO rejected all 24 claims of the patent based on 12 prior art references provided by Sun. There is currently a request pending before the trial judge to stay this patent from the litigation as well. NetApp Patent No. 5,819,292 – The PTO has rejected all of the asserted claims of this patent relying on at least two separate prior art references out of the many provided by Sun. (The examiner felt that to consider the other references would be "redundant".) March 19, 2016 Dusan Baljevic 70 Appendix Sun Versus NetApp (part 2) The Markman Order: • In summary, the court agreed with Sun's interpretation on six of the disputed terms (two of which the court adopted with slight modification) and with NetApp on one. As to the remaining terms, the court either formulated its own interpretation or requested that the parties propose a further construction (i.e. definition). 
Most significantly, the Court found each of the asserted claims in NetApp's 7,200,715 patent relating to RAID technology to be "indefinite" meaning that someone with experience in this area of technology could not understand the limits of the claimed invention. With regard to NetApp's '715 patent, the court agreed with Sun's position that the claims of the patent are flatly inconsistent with and impossible under the teaching of the patent specification. In effect, unless NetApp appeals and this finding is reversed, the '715 patent is effectively invalidated in this case and against others in the future. • In addition, the Court's findings on the terms "server identification data", "domain name", "portion of a communication" "element of a communication" and "completing a write operation within a local processing node" further strengthen our position that the processors, network interface and systems management software used across NetApp's product line infringe Sun's patents. March 19, 2016 Dusan Baljevic 71 Thank You! © 2008 Dusan Baljevic The information contained herein is subject to change without notice