[#VSM-51] Install Fails for VSM 0.8.0 Engineering Build

[VSM-51] Install Fails for VSM 0.8.0 Engineering Build Release
Created: 06/Dec/14 Updated: 27/Oct/15 Resolved: 25/Jan/15
Status: Resolved
Project: Virtual Storage Manager
Component/s: Usability
Affects Version/s: 1.1
Fix Version/s: 1.1
Type: Bug
Priority: P1
Reporter: Ferber Dan
Assignee: Wang Yaguang
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Attachments:
api-paste-ini.txt
cluster.manifest.txt
etc.hosts.txt
httpd_logs.tar
netstat-info.txt
ps-info.txt
rpms_list.txt
vsf-conf.txt
vsm-api-log-2.txt
vsm-api-log.txt
vsm-conductorlog.txt
vsm-scheduler-log.txt
Issue Severity: Major
OS Name: CentOS
OS Version: 6.5 Basic Server
Reproducibility: Always (100%)
Description
After following the instructions in INSTALL.md (which are excellent by the way) in the 0.8.0
released engineering package, everything installs OK and the vsm-controller seems to start
successfully, and with no errors. iptables are OFF and SELinux is disabled.
However, when I try to connect my browser to the vsmconsole after a reboot, I first get the
standard dashboard username/password prompt. But when I enter the username and password,
I get "Something went wrong! An unexpected error has occurred. Try using the "back" button to
return to the previous page, or contact your local administrator." Every time after that, I get
this same message when I connect to the dashboard, until the next reboot, when the cycle repeats.
I've attached all of the log files I can think of plus some system info as well.
Comments
Comment by Wang Yaguang [ 07/Dec/14 ]
There are errors in vsm-api-log.txt:
2014-12-06 21:28:43 ERROR [vsm.manifest.parser] Format error, at least, need setstorage_group_near_full_threshold\
2014-12-06 21:28:43 ERROR [vsm.manifest.parser] Format error, at least, need setstorage_group_full_threshold\
Could you paste the cluster.manifest also?
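For anyone triaging a similar install, the ERROR lines quoted above can be pulled out of a VSM log and counted per logger. This is a minimal sketch, assuming every log line follows the same "date time LEVEL [logger] message" layout as the two lines quoted here; the function name summarize_errors is made up for illustration.

```shell
#!/bin/bash
# Count ERROR lines in a log file, grouped by logger name.
# ASSUMPTION: lines look like
#   2014-12-06 21:28:43 ERROR [vsm.manifest.parser] message text
# so field 3 is the level and field 4 is the bracketed logger name.
summarize_errors() {
    local logfile="$1"
    awk '$3 == "ERROR" { count[$4]++ }
         END { for (name in count) print count[name], name }' "$logfile" |
        sort -rn   # most frequent logger first
}
```

Usage: `summarize_errors vsm-api-log.txt` prints one "count [logger]" line per logger that reported errors.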
Comment by Ferber Dan [ 08/Dec/14 ]
Thanks Yaguang. I have added cluster.manifest, plus etc.hosts.
I saw those errors also in vsm-api-log.txt, but in the log it looks like the parser then continues and successfully completes the parsing operations. When I saw these, I went back to a working
VSM 0.5.9 (pre-open source) install I have, and these same exact errors are in its vsm-api-log, but
that VSM works fine.
Comment by Ferber Dan [ 08/Dec/14 ]
I changed the behavior section slightly just now in the original description. The symptom I am
now seeing matches exactly with as well, which is just a VSM 0.7.1 install attempt. I am also
adding my latest vsm-api.log file, called vsm-api-log-2.txt
Comment by Ferber Dan [ 08/Dec/14 ]
Ben Aquino pointed out that the current install grabs ceph 0.87, and that there may be VSM
issues with ceph 0.87. He suggests that I redo my install from scratch but specify
ceph 0.86 in my public.repos file, as in the repo file below. I plan to do this in the next
24 hours.
# cat ceph806.repo
[ceph806]
name=ceph806
baseurl=http://192.168.100.11/ceph806
gpgcheck=0
enabled=1
proxy=_none_
Comment by aquino ben [ 08/Dec/14 ]
Replacing IP with Intel LAN IP.
# cat ceph806.repo
[ceph806]
name=ceph806
baseurl=http://10.23.76.11/ceph806
gpgcheck=0
enabled=1
proxy=_none_
Comment by aquino ben [ 08/Dec/14 ]
Thanks Dan! I thought install.MD also fails to pull MariaDB packages from public repos?
(should be added to JIRA-).
_Ben
Thanks Ben. I believe MariaDB installs fine for me, but I will check at next install, plus I will
keep a single install log, with all of the yum make cache, vsm repo build, and preinstall logs and
messages - so we can always go back and look - and will use your script to add it if it is not
installed.
_Dan
Comment by aquino ben [ 08/Dec/14 ]
My script to install MariaDB SQL.
# cat install-ceph-Maria.sh
#!/bin/bash
yum install -y MariaDB-devel
yum install -y MariaDB-server
yum install -y MariaDB-client
yum install -y MariaDB-common
yum install -y ceph
ceph -v
rpm -qa |grep ceph
rpm -qa |grep Maria
#end of script
Comment by aquino ben [ 08/Dec/14 ]
At a quick glance, this looks like the wrong MariaDB packages, and/or not all of the MariaDB
packages got installed from the public repos.
BenA
Comment by Wang Yaguang [ 08/Dec/14 ]
rpm list
Dan note: this is file rpms_list.txt, which is found in the vsm-0.8.0.tar.gz file in
https://github.com/01org/virtual-storage-manager/releases
Comment by Wang Yaguang [ 08/Dec/14 ]
There are issues working with ceph 0.87, and because the script tries to get the latest version of
its dependencies, it causes unexpected behavior. The "rpms_list.txt" file shows the expected rpm versions.
Comment by Ferber Dan [ 08/Dec/14 ]
Taking this private to Intel while we figure out how to modify INSTALL.md for VSM 0.8.0 at
least, so that an open source user or partner can download the source, build, and install. Yaguang,
Ben, and Dan are the only watchers at this point anyway.
Comment by Ferber Dan [ 08/Dec/14 ]
While the list of 164 expected rpm versions is interesting and useful for our investigation here, it is not helpful for people wanting to use VSM. We cannot expect each person who downloads
VSM to go find each of these RPMs.
For 0.5.9 Ben and I had a script that went and got all of these RPMs, but it takes a lot of work to
go find them all and then add to a script. And over time some of them become obsolete and
change locations.
Short term Yaguang will, I am sure, think about a solution - one that lets people install VSM
easily.
Longer term, we need to make VSM much less dependent on specific RPM versions of pre-req
packages.
Comment by Wang Yaguang [ 08/Dec/14 ]
For the short term, we still need this rpm list for troubleshooting, and I will write a script that
quickly checks whether all required packages are present.
Certainly, building a flexible dependency solution is difficult, but longer term there are
some aspects we could work on: i) figure out which packages most easily cause problems (ceph,
openstack, and MariaDB come to mind), ii) fork those projects under 01org, iii) have the
installation refer to those forks directly.
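The quick check described above could look something like the following. This is only a sketch, not the script that was eventually written. It assumes the expected list holds one full package name per line in the same name-version-release.arch form that `rpm -qa` prints; the real rpms_list.txt layout may differ. The function names are made up for illustration.

```shell
#!/bin/bash
# Print every line of the expected list ($1) that is absent from the installed list ($2).
# ASSUMPTION: both files hold one package per line in `rpm -qa` style, for example
#   python-django-horizon-2013.1.1-1.el6.noarch
missing_packages() {
    local expected="$1" installed="$2"
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from a file.
    # grep exits 1 when nothing is missing; that is not an error here.
    grep -Fxv -f "$installed" "$expected" || true
}

# Compare the expected list against what is installed on this machine.
check_against_system() {
    local expected="$1" installed missing
    installed=$(mktemp)
    rpm -qa > "$installed"
    missing=$(missing_packages "$expected" "$installed")
    rm -f "$installed"
    if [ -n "$missing" ]; then
        printf 'Not installed at the expected version:\n%s\n' "$missing"
        return 1
    fi
    echo "All expected packages are installed."
}
```

Usage: `check_against_system rpms_list.txt`. A package installed at a different version shows up in the output, because the comparison is on the full name-version-release string.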
Comment by Ferber Dan [ 09/Dec/14 ]
Thanks Yaguang. For those at Intel following this bug, and in summary, there are several
external RPMs that are no longer exactly compatible with VSM. Historically the development
team has had their own custom repos and built from those. For VSM 0.5.9, Ben Aquino, Steve
Anderson, and I created a set of installation instructions along with a "get the correct RPM
versions" wget script, and then an "install all RPMs" script, which matched what VSM
0.5.9 needed - and all was well. With these instructions, all installed fine.
Yaguang is looking at those scripts to see if he wants to do something similar for the open
source install - or maybe a hybrid process.
If anyone inside Intel needs to install and play with VSM right now, then Ben Aquino may be
able to hand you a working set of RPMs and the correct repos to install against - though if you
can wait a few days, I believe Yaguang will soon have a process that lets someone download
from github and get VSM installed and running. That is the goal.
Comment by Wang Yaguang [ 10/Dec/14 ]
I have an initial workaround for the installation problem. The investigation shows the root
cause is in the python-django-horizon package. The workaround is for users who have already
installed vsm v0.8 but can't open the vsm controller web page:
i) Remove the installed vsm-dashboard and python-django-horizon packages
 # rpm -e vsm-dashboard
 # rpm -e python-django-horizon
ii) Download the rpm packages from the vsm-dependencies github repository
(https://github.com/01org/vsm-dependencies/tree/master/repo). The following packages need to
be downloaded from this site:
 python-django-horizon, python-quantumclient, python-swiftclient, python-cinderclient,
python-glanceclient, python-novaclient
iii) Reinstall the python-django-horizon package
 # rpm -ivh python-quantumclient-2.2.1-2.el6.noarch.rpm
 # rpm -ivh python-swiftclient-1.4.0-1.el6.noarch.rpm
 # rpm -ivh python-cinderclient-1.0.4-1.el6.noarch.rpm
 # rpm -ivh python-glanceclient-0.9.0-2.el6.noarch.rpm
 # rpm -ivh python-novaclient-2.13.0-1.el6.noarch.rpm
 # rpm -ivh python-django-horizon-2013.1.1-1.el6.noarch.rpm
iv) Reinstall vsm-dashboard; this is exactly the same package as in the v0.8 release package
 # rpm -ivh vsm-dashboard-2014.11-0.8.0.el6.noarch.rpm
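Steps i), iii), and iv) above can be collected into one script so they are run in the same order every time. This is a sketch only: the package file names are the ones listed in the steps, while RPM_DIR (the directory holding the downloaded rpms), DRY_RUN, and the function names are assumptions introduced for illustration. By default it only prints the commands.

```shell
#!/bin/bash
# ASSUMPTIONS of this sketch: RPM_DIR is where the downloaded rpms were saved,
# and DRY_RUN=1 (the default) prints each command instead of running it.
RPM_DIR="${RPM_DIR:-.}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_workaround() {
    local pkg
    # i) remove the installed dashboard and horizon packages
    run rpm -e vsm-dashboard || return 1
    run rpm -e python-django-horizon || return 1
    # iii) and iv) install the clients, then horizon, then the dashboard,
    # in the order given in the steps
    for pkg in \
        python-quantumclient-2.2.1-2.el6.noarch.rpm \
        python-swiftclient-1.4.0-1.el6.noarch.rpm \
        python-cinderclient-1.0.4-1.el6.noarch.rpm \
        python-glanceclient-0.9.0-2.el6.noarch.rpm \
        python-novaclient-2.13.0-1.el6.noarch.rpm \
        python-django-horizon-2013.1.1-1.el6.noarch.rpm \
        vsm-dashboard-2014.11-0.8.0.el6.noarch.rpm
    do
        run rpm -ivh "$RPM_DIR/$pkg" || return 1
    done
}
```

Usage: call `apply_workaround` to review the command list, then set `DRY_RUN=0` and call it again as root to apply it. Stopping at the first failed command keeps a partial install from going unnoticed.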
Comment by Ferber Dan [ 10/Dec/14 ]
Thanks Yaguang. I went through these steps, and then rebooted my vsm controller node. I did
not stop vsm controller before I did your steps - as stopping the vsm controller was not in the
instructions. But when I rebooted, I assume I picked up the new environments.
The only strange thing I saw when I downloaded (by cloning your vsm-dependencies repo) and
when I did the "rpm -ivh xxxx" was this NOKEY error, for each of the rpms I installed, for
example:
[root@vsmconsole vsmrpm]# rpm -ivh ./python-django-horizon-2013.1.1-1.el6.noarch.rpm
warning: ./python-django-horizon-2013.1.1-1.el6.noarch.rpm: Header V4 RSA/SHA1 Signature, key ID d97b3247: NOKEY
Preparing... ########################################### [100%]
1:python-django-horizon ########################################### [100%]
[root@vsmconsole vsmrpm]# rpm -ivh ./vsm-dashboard-2014.11-0.8.0.el6.noarch.rpm
Preparing... ########################################### [100%]
1:vsm-dashboard ########################################### [100%]
[root@vsmconsole vsmrpm]#
When my system came up after executing your instructions, I could connect to the dashboard
and I did not get the "something happened" error, but instead I got a "An error occurred
authenticating. Please try again later." error in the UI, and I was using the correct password, per
"cat /etc/vsmdeploy/deployrc |grep ADMIN_PASSWORD"
I have tar'd all of my httpd log files and will attach them here.
Let me know if you would like additional log or other information, or if you would like me to
try some different steps.
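On the NOKEY warning above: it means rpm found a signature on the package but has no matching public key imported, so the signature could not be verified; the package still installs. A small sketch for checking this follows. The function names are hypothetical, and where the matching public key should come from is not stated in this thread.

```shell
#!/bin/bash
# Extract the key ID from an rpm "NOKEY" warning line read on stdin.
key_id_from_warning() {
    sed -n 's/.*key ID \([0-9a-fA-F]*\): NOKEY.*/\1/p'
}

# Succeed if a public key with this ID has been imported into the rpm database.
# Imported keys appear as pseudo-packages named gpg-pubkey-<keyid>-<timestamp>.
key_is_imported() {
    rpm -q "gpg-pubkey-$1" > /dev/null 2>&1
}
```

Usage: `rpm -ivh ./some.rpm 2>&1 | key_id_from_warning` prints the key ID, and `key_is_imported d97b3247` reports whether that key is known to rpm. `rpm -K ./some.rpm` checks a package's digests and signature without installing it.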
Comment by Ferber Dan [ 10/Dec/14 ]
SUCCESS!
Yaguang, I went back and installed from scratch, using the VSM 0.8.0 engineering build release
source. But after the "# preinstall" command step, I then executed your workaround commands.
Then I configured the cluster.manifest file, and started the vsm-controller for the first time. It
comes up and the UI connects fine. So all looks great.
Tomorrow (Thursday) I will create my three ceph storage nodes, provision my storage, install
my three vsm agents, and create the vsm cluster - and report back on the results.
Comment by Wang Yaguang [ 10/Dec/14 ]
Good news. Looking forward to your runnable VSM deployment.
Comment by Ferber Dan [ 11/Dec/14 ]
More success! I generated three storage nodes, made them known to VSM, and successfully
created a cluster. The cluster is running fine, and I can create pools and write data successfully.
I understand you will create a "known issues" section in the v0.8 release notes to reference .
I think that solves this for VSM 0.8.0.
I am not sure whether you intend this workaround to be the permanent fix for subsequent
engineering builds, or for 1.0?
Comment by Wang Yaguang [ 25/Jan/15 ]
Starting from v1.1, third-party packages will be checked and controlled, to avoid mismatched
dependent packages being used and causing weird behavior.
Generated at Sat Mar 05 19:12:51 PST 2016 using JIRA 6.3.14#6345sha1:47b2bb0a76c6e60bffb16fa45719b26a7e5e0c78.