M3000 / ZFS disk Swap.

Kiss
Keep It Simple Stupid.
Subject:-M3000 ZFS Disk Swap
Revision 0113
Disclaimer.
These notes are written to try to help with some tasks.
They are usually text and screen captures of tasks and functions I have done in the field and
while using my lab kit.
I do not say they are correct under all conditions, nor do I say they will fix all problems, but
they may offer some help.
I just offer them as a suggestion of what I found and did at that time while performing tasks.
I hope they help.
If they do then let people know and copy them out freely.
Thanks and good luck.
Mike
Mike Roberts.
[email protected] (07/01/15)
Index.
Notes.
Access (XSCF)
Create an Account.
Log in as default and move the keyswitch.
Create the eis-installer login. Set the password and privileges.
Log out and in as eis-installer.
Check system.
Showstatus
Connect to the domain.
Check the disk status.
Try and access the disk.
It's Broken.
Detach the disk.
Test the new drive.
Format ANA Read.
Apply a partition table.
Recreate the mirror
Check the system
Wait a while.
Exit the session.
Return to the XSCF.
Exit the XSCF session.
Login to the XSCF
Revision History
Notes.
I was given an M3000 call to replace a faulty ZFS disk and, not having done one
before, made a few notes.
Access (XSCF)
The M3000 has a single serial port on the back. This is the XSCF.
Create an Account.
To access the system we need an account on the XSCF. It is the same process as
for the rest of the OPL family.
Log in as default and move the keyswitch.
login: default
Change the panel mode switch to Service and press return...
Leave it in that position for at least 5 seconds. Change the panel mode switch to
Locked, and press return...
XSCF>
Create the eis-installer login. Set the password and privileges.
XSCF>
XSCF> adduser eis-installer
XSCF> password eis-installer
New XSCF password:
Retype new XSCF password:
XSCF> setprivileges eis-installer platadm useradm auditadm fieldeng mode
XSCF> exit
logout
Log out and in as eis-installer.
login:
login: eis-installer
Password:
XSCF>
Check system.
XSCF>
XSCF> showhardconf -u
SPARC Enterprise M3000; Memory_Size:8 GB;
+-----------------------------------+------------+
|                FRU                |  Quantity  |
+-----------------------------------+------------+
| MBU_A                             |      1     |
|     CPU                           |    (  1)   |
|         Freq:2.520 GHz;           |    (  1)   |
|     MEM                           |      4     |
|         Type:2A; Size:2 GB;       |    (  4)   |
| OPNL                              |      1     |
| PSU                               |      2     |
| FANBP_B                           |      1     |
|     FAN_A                         |      2     |
+-----------------------------------+------------+
XSCF> showhardconf
SPARC Enterprise M3000;
+ Serial:PX61004151; Operator_Panel_Switch:Locked;
+ Power_Supply_System:Single; SCF-ID:XSCF#0;
+ System_Power:On; System_Phase:Cabinet Power On;
Domain#0 Domain_Status:Running;
MBU_A Status:Normal; Ver:0101h; Serial:PP100401S4 ;
+ FRU-Part-Number:CA07082-D011 E1 /541-3302-05 ;
+ CPU Status:Normal;
+ Freq:2.520 GHz; Type:32;
+ Core:4; Strand:2;
+ Memory_Size:8 GB;
MEM#0A Status:Normal;
+ Code:2c000000000000000818HTF25672PY-667E10100-da545fc4;
+ Type:2A; Size:2 GB;
MEM#1A Status:Normal;
+ Code:2c000000000000000818HTF25672PY-667E10100-da545fc7;
+ Type:2A; Size:2 GB;
MEM#2A Status:Normal;
+ Code:2c000000000000000818HTF25672PY-667E10100-da545f19;
+ Type:2A; Size:2 GB;
MEM#3A Status:Normal;
+ Code:2c000000000000000818HTF25672PY-667E10100-da545fc6;
+ Type:2A; Size:2 GB;
OPNL Status:Normal; Ver:0101h; Serial:PP100404F2 ;
+ FRU-Part-Number:CA07082-D021 A1 /541-3306-01 ;
PSU#0 Status:Normal; Serial:EA0951H711;
+ FRU-Part-Number:CA01022-0720 09J /300-2193-09 ;
+ Power_Status:On;
+ Type:AC;
PSU#1 Status:Normal; Serial:EA0951H716;
+ FRU-Part-Number:CA01022-0720 09J /300-2193-09 ;
+ Power_Status:On;
+ Type:AC;
FANBP_B Status:Normal; Ver:0101h; Serial:PP1004003K ;
+ FRU-Part-Number:CA20399-B12X 009AC/541-3304-03 ;
FAN_A#0 Status:Normal;
FAN_A#1 Status:Normal;
XSCF>
Showstatus
Alternatively, the showstatus command lists just the failed or degraded components.
Connect to the domain.
XSCF>
XSCF> console -d 0
Connect to DomainID 0?[y|n] :y
sunH02 console login: root
Password:
Aug 24 13:06:34 sunH02 login: ROOT LOGIN /dev/console
Last login: Fri Jul 16 09:20:43 from sun101.theaa.lo
Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
You have new mail.
Running a ksh..
sunH02:root - /:>
Check the disk status.
sunH02:root - /:>
sunH02:root - /:>zpool list
NAME     SIZE   USED  AVAIL    CAP  HEALTH    ALTROOT
AAHELP   136G  3.59G   132G     2%  DEGRADED
rpool    136G  13.0G   123G     9%  ONLINE
sunH02:root - /:>zpool status
  pool: AAHELP
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Wed Aug 18 11:39:56 2010
config:
        NAME          STATE     READ WRITE CKSUM
        AAHELP        DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c0t2d0s0  OFFLINE    265     0     2
            c0t3d0s0  ONLINE       0     0     0
        656M resilvered
errors: No known data errors
  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
errors: No known data errors
sunH02:root - /:>
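On a box with many pools it can be handy to pull out only the rows that are not ONLINE. The following is a minimal sketch, not something I ran on the M3000: the function name unhealthy_vdevs is made up for illustration, and it assumes the usual NAME / STATE / READ / WRITE / CKSUM layout of the config table.

```shell
# unhealthy_vdevs: read `zpool status` text on stdin and print the name and
# state of every pool/vdev row whose STATE column is not ONLINE.
# (Hypothetical helper; assumes the standard five-column config table.)
unhealthy_vdevs() {
    awk '
        $1 == "NAME" && $2 == "STATE"        { in_cfg = 1; next }  # table header
        $1 == "errors:"                      { in_cfg = 0; next }  # end of this pool
        in_cfg && NF >= 5 && $2 != "ONLINE"  { print $1, $2 }      # degraded/offline rows
    '
}
```

Typical use would be zpool status | unhealthy_vdevs. For a quick look, zpool status -x on its own also limits the output to pools that have a problem.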
Try and access the disk.
We try to access it to get the LED to flash and locate the disk.
sunH02:root - /:>
sunH02:root - /:>format c0t2d0
selecting c0t2d0
[disk formatted]
Aug 24 13:07:39 sunH02 scsi: WARNING: /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0 (sd2):
Aug 24 13:07:39 sunH02  Error for Command: read(10)    Error Level: Retryable
Aug 24 13:07:39 sunH02 scsi:  Requested Block: 0    Error Block: 33
Aug 24 13:07:39 sunH02 scsi:  Vendor: FUJITSU    Serial Number: D0C5P9C006YA
Aug 24 13:07:39 sunH02 scsi:  Sense Key: Media Error
Aug 24 13:07:39 sunH02 scsi:  ASC: 0x11 (read retries exhausted), ASCQ: 0x1, FRU: 0x0
Aug 24 13:07:41 sunH02 scsi: WARNING: /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0 (sd2):
Aug 24 13:07:41 sunH02  Error for Command: read(10)    Error Level: Retryable
Aug 24 13:07:41 sunH02 scsi:  Requested Block: 256    Error Block: 426
Aug 24 13:07:41 sunH02 scsi:  Vendor: FUJITSU    Serial Number: D0C5P9C006YA
sunH02:root - /:>
sunH02:root - /:>
sunH02:root - /:>
sunH02:root - /:>format
Searching for disks...Aug 24 13:09:11 sunH02 scsi: WARNING: /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0 (sd2):
Aug 24 13:09:11 sunH02  Error for Command: read    Error Level: Informational
Aug 24 13:09:11 sunH02 scsi:  Requested Block: 0    Error Block: 0
Aug 24 13:09:11 sunH02 scsi:  Vendor: FUJITSU    Serial Number: D0C5P9C006YA
Aug 24 13:09:11 sunH02 scsi:  Sense Key: Soft Error
Aug 24 13:09:11 sunH02 scsi:  ASC: 0x5d (hardware impending failure data error rate too high), ASCQ: 0x12, FRU: 0x0
It's Broken.
We can't access the failed disk, so pick another one.
sunH02:root - /:>format
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@0,0
       1. c0t1d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@1,0
       2. c0t2d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0
       3. c0t3d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@3,0
Specify disk (enter its number): 3
selecting c0t3d0
[disk formatted]
/dev/dsk/c0t3d0s0 is part of active ZFS pool AAHELP. Please see zpool(1M).
/dev/dsk/c0t3d0s2 is part of active ZFS pool AAHELP. Please see zpool(1M).
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> ana
ANALYZE MENU:
        read     - read only test   (doesn't harm SunOS)
        refresh  - read then write  (doesn't harm data)
        test     - pattern testing  (doesn't harm data)
        write    - write then read      (corrupts data)
        compare  - write, read, compare (corrupts data)
        purge    - write, read, write   (corrupts data)
        verify   - write entire disk, then verify (corrupts data)
        print    - display data buffer
        setup    - set analysis parameters
        config   - show analysis parameters
        !<cmd>   - execute <cmd> , then return
        quit
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y
        pass 0
   0/0/0
   0/0/126
   0/0/252
   0/0/378
   0/0/504
   0/0/630
   0/0/756
.
.
.
sunH02:root - /:>^C
This will have made the activity LED on that disk flash, which helps identify the disk locations.
Detach the disk.
To replace the disk we detach it from the mirror.
NB: I could instead have swapped the hardware and then used # zpool replace AAHELP c0t2d0s0
sunH02:root - /:>
sunH02:root - /:>zpool detach AAHELP c0t2d0s0
sunH02:root - /:>zpool status AAHELP
pool: AAHELP
state: ONLINE
scrub: resilver completed after 0h0m with 0 errors on Wed Aug 18 11:39:56 2010
config:
        NAME        STATE     READ WRITE CKSUM
        AAHELP      ONLINE       0     0     0
          c0t3d0s0  ONLINE       0     0     0
errors: No known data errors
sunH02:root - /:>
Then we unconfigure it to stop Solaris accessing it.
sunH02:root - /:>
sunH02:root - /:>cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
SB0                            System_Brd   connected    configured   ok
SB0::cpu0                      cpu          connected    configured   ok
SB0::memory                    memory       connected    configured   ok
SB0::pci0                      io           connected    configured   ok
SB0::pci1                      io           connected    configured   ok
SB0::pci8                      io           connected    configured   ok
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    configured   unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c0::dsk/c0t4d0                 CD-ROM       connected    configured   unknown
sunH02:root - /:>cfgadm -c unconfigure c0::dsk/c0t2d0
sunH02:root - /:>
sunH02:root - /:>cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
SB0                            System_Brd   connected    configured   ok
SB0::cpu0                      cpu          connected    configured   ok
SB0::memory                    memory       connected    configured   ok
SB0::pci0                      io           connected    configured   ok
SB0::pci1                      io           connected    configured   ok
SB0::pci8                      io           connected    configured   ok
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    unconfigured unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c0::dsk/c0t4d0                 CD-ROM       connected    configured   unknown
sunH02:root - /:>
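Before pulling the drive it may be worth confirming that the attachment point really does show as unconfigured. This is only a sketch: the helper names occupant_state and safe_to_pull are mine, and it assumes the five-column cfgadm -al layout with Occupant as the fourth field. On Solaris 10 the plain /usr/bin/awk may not accept -v, so nawk or /usr/xpg4/bin/awk would be needed there.

```shell
# occupant_state AP_ID: read `cfgadm -al` text on stdin and print the
# Occupant column (4th field) for the given attachment point.
occupant_state() {
    awk -v ap="$1" '$1 == ap { print $4 }'
}

# safe_to_pull AP_ID: succeed only if that occupant is "unconfigured".
# (Hypothetical helper; it checks the cfgadm state and nothing else.)
safe_to_pull() {
    [ "$(occupant_state "$1")" = "unconfigured" ]
}
```

For example: cfgadm -al | safe_to_pull c0::dsk/c0t2d0 && echo "c0t2d0 can be pulled"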
We now replace the hardware (just pull the old drive and fit the new one).
Then configure it back for Solaris to use.
sunH02:root - /:>
sunH02:root - /:>cfgadm -c configure c0::dsk/c0t2d0
sunH02:root - /:>cfgadm -al
Ap_Id                          Type         Receptacle   Occupant     Condition
SB0                            System_Brd   connected    configured   ok
SB0::cpu0                      cpu          connected    configured   ok
SB0::memory                    memory       connected    configured   ok
SB0::pci0                      io           connected    configured   ok
SB0::pci1                      io           connected    configured   ok
SB0::pci8                      io           connected    configured   ok
c0                             scsi-bus     connected    configured   unknown
c0::dsk/c0t0d0                 disk         connected    configured   unknown
c0::dsk/c0t1d0                 disk         connected    configured   unknown
c0::dsk/c0t2d0                 disk         connected    configured   unknown
c0::dsk/c0t3d0                 disk         connected    configured   unknown
c0::dsk/c0t4d0                 CD-ROM       connected    configured   unknown
sunH02:root - /:>
Test the new drive.
We check the partition table to see whether the disk has an EFI or an SMI label.
sunH02:root - /:>
sunH02:root - /:>format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@0,0
       1. c0t1d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@1,0
       2. c0t2d0 <FUJITSU-MBD2147RC-3702 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0
       3. c0t3d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@3,0
Specify disk (enter its number): 2
selecting c0t2d0
[disk formatted]
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> p
PARTITION MENU:
        0      - change `0' partition
        1      - change `1' partition
        2      - change `2' partition
        3      - change `3' partition
        4      - change `4' partition
        5      - change `5' partition
        6      - change `6' partition
        7      - change `7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        !<cmd> - execute <cmd>, then return
        quit
partition> p
Current partition table (original):
Total disk cylinders available: 14087 + 2 (reserved cylinders)
Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -    12      129.19MB    (13/0/0)       264576
  1       swap    wu      13 -    25      129.19MB    (13/0/0)       264576
  2     backup    wu       0 - 14086      136.71GB    (14087/0/0) 286698624
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6        usr    wm      26 - 14086      136.46GB    (14061/0/0) 286169472
  7 unassigned    wm       0                0         (0/0/0)             0
partition> q
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> q
sunH02:root - /:>
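The Blocks column in the partition table can be sanity-checked from the geometry format reports for these drives (hd 24 sec 848): a slice of whole cylinders covers cylinders x heads x sectors blocks. A small sketch of that arithmetic follows; the function name cyl_to_blocks is made up, and it assumes a shell with POSIX $(( )) arithmetic such as ksh.

```shell
# cyl_to_blocks CYLS HEADS SECTORS: blocks covered by CYLS whole cylinders.
cyl_to_blocks() {
    echo $(( $1 * $2 * $3 ))
}

cyl_to_blocks 13 24 848      # slices 0 and 1 -> 264576
cyl_to_blocks 14087 24 848   # slice 2 (whole disk) -> 286698624
cyl_to_blocks 14061 24 848   # slice 6 -> 286169472
```

The three results match the Blocks values printed for the new disk, which suggests it arrived with a sensible SMI label.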
Format ANA Read.
Run a read-only surface analysis (format, then analyze, then read) to test the new device.
sunH02:root - /:>
sunH02:root - /:>format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c0t0d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@0,0
       1. c0t1d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@1,0
       2. c0t2d0 <FUJITSU-MBD2147RC-3702 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@2,0
       3. c0t3d0 <FUJITSU-MBD2147RC-3701 cyl 14087 alt 2 hd 24 sec 848>
          /pci@0,600000/pci@0/pci@0/scsi@0/sd@3,0
Specify disk (enter its number): 2
selecting c0t2d0
[disk formatted]
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> ana
ANALYZE MENU:
        read     - read only test   (doesn't harm SunOS)
        refresh  - read then write  (doesn't harm data)
        test     - pattern testing  (doesn't harm data)
        write    - write then read      (corrupts data)
        compare  - write, read, compare (corrupts data)
        purge    - write, read, write   (corrupts data)
        verify   - write entire disk, then verify (corrupts data)
        print    - display data buffer
        setup    - set analysis parameters
        config   - show analysis parameters
        !<cmd>   - execute <cmd> , then return
        quit
analyze> read
Ready to analyze (won't harm SunOS). This takes a long time,
but is interruptable with CTRL-C. Continue? y
        pass 0
   0/0/0
   0/0/126
   0/0/252
   0/0/378
   0/0/504
   0/0/630
   0/0/756
Apply a partition table.
Now put the partition table back. Since it's a mirror of c0t3d0, we can just copy the
table over from that disk.
sunH02:root - /:>
sunH02:root - /:>prtvtoc /dev/rdsk/c0t3d0s2 | fmthard -s - /dev/rdsk/c0t2d0s2
sunH02:root - /:>
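fmthard is silent when it works, so one way to confirm the copy is to save the prtvtoc output of both disks and compare the slice rows. This is a sketch only: the helper names and the /tmp file names in the usage note are made up, and it relies on prtvtoc prefixing its header lines with "*".

```shell
# vtoc_slices FILE: print only the slice rows of saved prtvtoc output,
# dropping the header/comment lines that start with "*".
vtoc_slices() {
    grep -v '^\*' "$1"
}

# same_vtoc FILE_A FILE_B: succeed if both files hold identical, non-empty
# slice rows. The body runs in a subshell so its variables do not leak.
same_vtoc() (
    left=$(vtoc_slices "$1")
    right=$(vtoc_slices "$2")
    [ -n "$left" ] && [ "$left" = "$right" ]
)
```

For example: prtvtoc /dev/rdsk/c0t3d0s2 > /tmp/src.vtoc; prtvtoc /dev/rdsk/c0t2d0s2 > /tmp/new.vtoc; same_vtoc /tmp/src.vtoc /tmp/new.vtoc && echo "tables match"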
Recreate the mirror
sunH02:root - /:>
sunH02:root - /:>zpool attach -f AAHELP c0t3d0s0 c0t2d0s0
sunH02:root - /:>
Check the system
sunH02:root - /:>
sunH02:root - /:>zpool list
NAME     SIZE   USED  AVAIL    CAP  HEALTH    ALTROOT
AAHELP   136G  3.56G   132G     2%  ONLINE
rpool    136G  12.9G   123G     9%  ONLINE
sunH02:root - /:>zpool status
  pool: AAHELP
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h0m, 32.09% done, 0h0m to go
config:
        NAME          STATE     READ WRITE CKSUM
        AAHELP        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t3d0s0  ONLINE       0     0     0
            c0t2d0s0  ONLINE       0     0     0  1.14G resilvered
errors: No known data errors
  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
errors: No known data errors
sunH02:root - /:>
Wait a while.
Then check the system.
sunH02:root - /:>
sunH02:root - /:>zpool status
  pool: AAHELP
 state: ONLINE
 scrub: resilver completed after 0h1m with 0 errors on Tue Aug 24 13:23:08 2010
config:
        NAME          STATE     READ WRITE CKSUM
        AAHELP        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t3d0s0  ONLINE       0     0     0
            c0t2d0s0  ONLINE       0     0     0  3.56G resilvered
errors: No known data errors
  pool: rpool
 state: ONLINE
 scrub: none requested
config:
        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
errors: No known data errors
sunH02:root - /:>
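On a bigger pool the resilver can take a lot longer than the minute it took here, so the waiting could be scripted rather than re-running zpool status by hand. A rough sketch, assuming the scrub: line reads "resilver in progress" while the copy is running (as in the capture above); the function names and the 30-second default interval are my own choices.

```shell
# resilver_running: read `zpool status` text on stdin; succeed while the
# text still reports "resilver in progress".
resilver_running() {
    grep 'resilver in progress' >/dev/null
}

# wait_for_resilver POOL [SECONDS]: poll the pool until the resilver is done.
# Assumes zpool is on PATH; the polling interval defaults to 30 seconds.
wait_for_resilver() {
    while zpool status "$1" | resilver_running; do
        sleep "${2:-30}"
    done
}
```

For example: wait_for_resilver AAHELP 60 && zpool status AAHELP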
Exit the session.
sunH02:root - /:>
sunH02:root - /:>exit
sunH02 console login:
Return to the XSCF.
Type the console escape sequence (#. by default) to drop back to the XSCF shell.
sunH02 console login:
sunH02 console login: exit from console.
XSCF>
Exit the XSCF session.
XSCF>
XSCF> exit
logout
login:
Login to the XSCF
Login to the XSCF as default and remove the eis-installer login.
login:
login: default
Change the panel mode switch to Service and press return...
Leave it in that position for at least 5 seconds. Change the panel mode switch to
Locked, and press return...
XSCF> deleteuser eis-installer
XSCF> exit
logout
login: ~
Revision History
Revision   Date       Reason
0113       Jan 2013   Original document.