[ceph-users] ceph-iscsi iSCSI Login negotiation failed
Steven Vacaroaia
2018-12-05 15:43:14 UTC
Hi,
I have a strange issue.
I configured 2 identical iSCSI gateways, but one of them is complaining
about login negotiation even though gwcli reports the correct auth and
status (logged in).

Any help will be truly appreciated

Here are some details

ceph-iscsi-config-2.6-42.gccca57d.el7.noarch
ceph-iscsi-cli-2.7-54.g9b18a3b.el7.noarch
tcmu-runner-1.4.0-1.el7.x86_64
python-rtslib-2.1.fb67-10.g7713d1e.noarch

no errors in rbd-target-gw or api logs

/var/log/messages
Dec 2 03:46:10 osd01 kernel: iSCSI Login timeout on Network Portal 10.10.35.201:3260
Dec 2 03:46:10 osd01 kernel: tx_data returned -32, expecting 48.
Dec 2 03:46:10 osd01 kernel: iSCSI Login negotiation failed
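
A note on reading the tx_data line: assuming the kernel reports a failed send as a negative errno value (the usual Linux convention), -32 corresponds to EPIPE ("Broken pipe"), which would suggest the initiator had already closed the TCP connection when the target tried to send its 48-byte login response (48 bytes is the size of an iSCSI basic header segment). The sketch below only decodes the number; the function name is hypothetical and not part of any Ceph or kernel tooling.

```python
import errno
import os
import re

# Matches the kernel message format shown in the log excerpt above.
TX_DATA_RE = re.compile(r"tx_data returned (-?\d+), expecting (\d+)")


def explain_tx_data(line):
    """Return (name, description, expected_bytes), or None if the line does not match.

    Assumption: a negative return value is a negated errno; a non-negative
    value is treated as a byte count (a short write).
    """
    m = TX_DATA_RE.search(line)
    if m is None:
        return None
    returned, expected = int(m.group(1)), int(m.group(2))
    if returned >= 0:
        return ("short write", f"{returned} bytes sent", expected)
    code = -returned
    name = errno.errorcode.get(code, "unknown")
    return (name, os.strerror(code), expected)


line = "Dec 2 03:46:10 osd01 kernel: tx_data returned -32, expecting 48."
print(explain_tx_data(line))
```

On its own this does not say why the initiator dropped the connection; it only narrows the symptom to a send on a closed socket.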

gwcli
/iscsi-target...nner-21faa413> info
Client Iqn .. iqn.1998-01.com.vmware:banner-21faa413
Ip Address .. 10.10.35.77
Alias ..
Logged In .. LOGGED_IN
Auth
- chap .. cephuser/I8well4sure_
Group Name ..
Luns
- rbd.rep01 .. lun_id=0
- rbd.vmware01 .. lun_id=1

gwcli ls
o- / .......... [...]
  o- clusters .......... [Clusters: 1]
  | o- ceph .......... [HEALTH_WARN]
  |   o- pools .......... [Pools: 4]
  |   | o- cephfs_data .......... [(x2), Commit: 0.00Y/8378397184K (0%), Used: 333997734460b]
  |   | o- cephfs_metadata .......... [(x2), Commit: 0.00Y/1445234M (0%), Used: 1327016659b]
  |   | o- rbd .......... [(x2), Commit: 11.0T/8378397184K (140%), Used: 6083151607799b]
  |   | o- scbench .......... [(x2), Commit: 0.00Y/8378397184K (0%), Used: 83810582552b]
  |   o- topology .......... [OSDs: 20,MONs: 3]
  o- disks .......... [11.0T, Disks: 2]
  | o- rbd.rep01 .......... [rep01 (5.0T)]
  | o- rbd.vmware01 .......... [vmware01 (6.0T)]
  o- iscsi-target .......... [Targets: 1]
    o- iqn.2003-01.com.redhat.iscsi-gw:chicago-ceph .......... [Gateways: 2]
      o- gateways .......... [Up: 2/2, Portals: 2]
      | o- osd01 .......... [10.10.35.201 (UP)]
      | o- osd02 .......... [10.10.35.202 (UP)]
      o- host-groups .......... [Groups : 0]
      o- hosts .......... [Hosts: 7: Auth: CHAP]
        o- iqn.1998-01.com.vmware:pilot9-0a779e5f .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:vsan2-096f2c78 .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:vsan3-6197c943 .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:banner-21faa413 .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:pilot10-0d88ab21 .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:vsan1-7bb6d7ac .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
        | o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
        | o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]
        o- iqn.1998-01.com.vmware:vsan4-26874f81 .......... [LOGGED-IN, Auth: CHAP, Disks: 2(11.0T)]
          o- lun 0 .......... [rbd.rep01(5.0T), Owner: osd02]
          o- lun 1 .......... [rbd.vmware01(6.0T), Owner: osd01]


targetcli ls
o- / .......... [...]
  o- backstores .......... [...]
  | o- block .......... [Storage Objects: 0]
  | o- fileio .......... [Storage Objects: 0]
  | o- pscsi .......... [Storage Objects: 0]
  | o- ramdisk .......... [Storage Objects: 0]
  | o- user:glfs .......... [Storage Objects: 0]
  | o- user:qcow .......... [Storage Objects: 0]
  | o- user:rbd .......... [Storage Objects: 2]
  | | o- rbd.rep01 .......... [rbd/rep01;osd_op_timeout=30 (750.0GiB) activated]
  | | | o- alua .......... [ALUA Groups: 3]
  | | |   o- ano1 .......... [ALUA state: Active/non-optimized]
  | | |   o- ao .......... [ALUA state: Active/optimized]
  | | |   o- default_tg_pt_gp .......... [ALUA state: Active/optimized]
  | | o- rbd.vmware01 .......... [rbd/vmware01;osd_op_timeout=30 (6.0TiB) activated]
  | |   o- alua .......... [ALUA Groups: 3]
  | |     o- ano2 .......... [ALUA state: Active/non-optimized]
  | |     o- ao .......... [ALUA state: Active/optimized]
  | |     o- default_tg_pt_gp .......... [ALUA state: Active/optimized]
  | o- user:zbc .......... [Storage Objects: 0]
  o- iscsi .......... [Targets: 1]
  | o- iqn.2003-01.com.redhat.iscsi-gw:chicago-ceph .......... [TPGs: 2]
  |   o- tpg1 .......... [no-gen-acls, auth per-acl]
  |   | o- acls .......... [ACLs: 7]
  |   | | o- iqn.1998-01.com.vmware:banner-21faa413 .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:pilot10-0d88ab21 .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:pilot9-0a779e5f .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:vsan1-7bb6d7ac .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:vsan2-096f2c78 .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:vsan3-6197c943 .......... [1-way auth, Mapped LUNs: 2]
  |   | | | o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | | | o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | | o- iqn.1998-01.com.vmware:vsan4-26874f81 .......... [1-way auth, Mapped LUNs: 2]
  |   | |   o- mapped_lun0 .......... [lun0 user/rbd.rep01 (rw)]
  |   | |   o- mapped_lun1 .......... [lun1 user/rbd.vmware01 (rw)]
  |   | o- luns .......... [LUNs: 2]
  |   | | o- lun0 .......... [user/rbd.rep01 (ano1)]
  |   | | o- lun1 .......... [user/rbd.vmware01 (ao)]
  |   | o- portals .......... [Portals: 1]
  |   |   o- 10.10.35.201:3260 .......... [OK]
  |   o- tpg2 .......... [disabled]
  |     o- acls .......... [ACLs: 0]
  |     o- luns .......... [LUNs: 2]
  |     | o- lun0 .......... [user/rbd.rep01 (ao)]
  |     | o- lun1 .......... [user/rbd.vmware01 (ano2)]
  |     o- portals .......... [Portals: 1]
  |       o- 10.10.35.202:3260 .......... [OK]
  o- loopback .......... [Targets: 0]
  o- vhost .......... [Targets: 0]
  o- xen-pvscsi .......... [Targets: 0]

Mike Christie
2018-12-05 16:47:18 UTC
Is this just on the initial discovery and login with VMware initiators,
or does it happen repeatedly? Is 10.10.35.201 also the IP address you
set as the discovery address in the vSphere GUI?

If you are seeing this error over and over, and around the same time
/var/log/tcmu-runner.log or /var/log/messages on the target side shows
cmd timeout errors or something about disabling iSCSI tpgs/ports, then
tcmu-runner is not able to execute commands within 30 seconds. That is
normally due to a bad connection to the cluster, something going wrong
on the OSDs/MONs, some big slowdown on the OSDs, or runner not detecting
quickly enough that they went down.
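
One way to check for the "around the same time" pattern described above is to scan /var/log/messages for login failures and list any suspicious lines logged within a short window of each one. The sketch below is only an illustration: the keyword regex, the 60-second window, and the function names are assumptions, not the exact messages tcmu-runner or the kernel emit, and it handles only the syslog line format seen in this thread (which has no year, so one must be supplied).

```python
import re
from datetime import datetime, timedelta

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Syslog format as in this thread: "Dec 2 03:46:10 osd01 kernel: ..."
SYSLOG_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\S+)\s+(.*)$")
LOGIN_FAIL = "iSCSI Login negotiation failed"
# Assumption: loose keywords, not verbatim tcmu-runner or kernel messages.
SUSPECT_RE = re.compile(r"timed out|timeout|disabl", re.IGNORECASE)


def parse(line, year):
    """Return (timestamp, message) for a syslog line, or None if it does not parse."""
    m = SYSLOG_RE.match(line)
    if m is None or m.group(1) not in MONTHS:
        return None
    mon, day, hh, mm, ss, _host, msg = m.groups()
    return datetime(year, MONTHS[mon], int(day), int(hh), int(mm), int(ss)), msg


def correlate(lines, year, window_s=60):
    """Pair each login failure with suspect messages logged within window_s seconds."""
    fails, suspects = [], []
    for line in lines:
        parsed = parse(line, year)
        if parsed is None:
            continue
        ts, msg = parsed
        if LOGIN_FAIL in msg:
            fails.append(ts)
        elif "iSCSI Login" not in msg and SUSPECT_RE.search(msg):
            # Skip the login-timeout line itself; it is part of the symptom.
            suspects.append((ts, msg))
    window = timedelta(seconds=window_s)
    return [(f, [msg for ts, msg in suspects if abs(ts - f) <= window]) for f in fails]


if __name__ == "__main__":
    with open("/var/log/messages", errors="replace") as fh:
        for when, nearby in correlate(fh, year=2018):
            print(when, "->", nearby or "no suspect lines nearby")
```

If a failure shows no nearby suspect lines, that points toward a one-off login problem rather than tcmu-runner stalling on the cluster. /var/log/tcmu-runner.log uses its own timestamp format and would need a separate parser.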
Steven Vacaroaia
2018-12-05 17:01:28 UTC
Thanks for taking the trouble to respond.

I noticed some XFS errors on the /var partition, so I rebooted the
server in order to force xfs_repair to run.

It is now working.

Steven