Grid Control: EMD upload error
Today I was installing the Grid Control agent on a new server, and I probably mistyped the password during the installation. I didn’t notice at first, because the client installed successfully. However, it didn’t appear in my Grid Control maintenance window.
I connected to the brand new server and issued the following command:
/u01/app/oracle/product/grid_agent/agent10g/bin/emctl upload agent
And it produced this error:
EMD upload error: uploadXMLFiles skipped :: OMS version not checked yet..
This error was caused by the bad password I had typed. To change it, I first had to remove the password:
/u01/app/oracle/product/grid_agent/agent10g/bin/emctl unsecure agent
And then secure the agent again:
/u01/app/oracle/product/grid_agent/agent10g/bin/emctl secure agent
It asked me for a new password, which I typed correctly this time, and everything started working fine.
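The three commands above can be wrapped in a small shell function. This is only a sketch: the name resecure_agent and the EMCTL variable are my own, and the default path assumes the agent home used in this post. The secure step still prompts for the password interactively.

```shell
# Path to emctl in the agent home used in this post; override EMCTL for other installs.
EMCTL="${EMCTL:-/u01/app/oracle/product/grid_agent/agent10g/bin/emctl}"

# resecure_agent (hypothetical helper): remove the agent's secure setup,
# secure it again (emctl prompts for the password), then retry the upload.
# Stops at the first step that fails.
resecure_agent() {
    "$EMCTL" unsecure agent || return 1
    "$EMCTL" secure agent   || return 1
    "$EMCTL" upload agent
}
```

Calling resecure_agent as the agent owner runs the same sequence I typed by hand, in the same order.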
Grid Control Problem
Today I noticed that my Grid Control was slower than usual. When I tried to connect to the Grid Control Console, sometimes I could connect, but other times I got the following error:
Error
Authentication failed. Verify username/password that you have provided. If you believe you entered correct credentials, your account may have been locked,
contact system administrator to unlock your account.
—
Weird, I’m not mistyping my password, so what could be wrong?
The last change I made on my server was editing the file /etc/hosts, because the name of my machine was pointing to 127.0.0.1 and I wanted to change it to its real IP. It looks like something went wrong.
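A quick way to see which address a hosts file gives for a machine name is a small awk filter. This is a sketch; the name hosts_ips is hypothetical, and it only reads the file, so it says nothing about DNS or other name services.

```shell
# hosts_ips (hypothetical helper): print every address that a hosts file
# maps the given name to. Usage: hosts_ips NAME [FILE]
hosts_ips() {
    awk -v name="$1" '
        { sub(/#.*/, "") }               # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++)      # field 1 is the address, the rest are names
              if ($i == name) print $1 }
    ' "${2:-/etc/hosts}"
}
```

Running hosts_ips "$(hostname)" prints 127.0.0.1 if the machine name still points to the loopback address, and the real IP once the file has been changed.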
Another symptom was revealed when I used the command netstat:
[root@superlopez ~]# netstat -punta | wc -l
42487
And one minute later I ran it again…
[root@superlopez ~]# netstat -punta | wc -l
62881
Something was definitely wrong. Almost all the connections were of the following kind:
tcp 0 0 127.0.0.1:45118 127.0.0.1:6103 TIME_WAIT -
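Instead of reading tens of thousands of lines by eye, the netstat output can be grouped by foreign address and state. A minimal sketch, assuming the column layout shown above (foreign address in column 5, state in column 6); the name summarize_tcp is my own.

```shell
# summarize_tcp (hypothetical helper): read `netstat -punta` style output on
# stdin and count TCP connections per (foreign address, state), most frequent first.
summarize_tcp() {
    awk '$1 ~ /^tcp/ { print $5, $6 }' | sort | uniq -c | sort -rn
}
```

For example, netstat -punta | summarize_tcp | head shows the busiest endpoints first, so a port such as 6103 piling up in TIME_WAIT stands out immediately.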
OK, first step: this error must be logged somewhere, so let’s find all the logs on the server and sort them chronologically:
[root@superlopez ~]# ls -larth $(find / -name "*.log" 2>/dev/null)
-rw-rw---- 1 oracle oinstall 18M may 4 08:23 /u01/oracle/app/10.2.0.4/oms10g/j2ee/home/log/home_default_island_1/server.log
-rw-rw---- 1 oracle oinstall 18M may 4 08:23 /u01/oracle/app/grid/oms10g/j2ee/OCMRepeater/log/OCMRepeater_default_island_1/server.log
-rw-rw---- 1 oracle oinstall 18M may 4 08:23 /u01/oracle/app/grid/oms10g/j2ee/OC4J_EMPROV/log/OC4J_EMPROV_default_island_1/server.log
-rw------- 1 oracle oinstall 18M may 4 08:23 /u01/oracle/app/grid/oms10g/j2ee/home/log/home_default_island_1/server.log
-rw-r----- 1 oracle oinstall 51M may 4 08:23 /san/datos/CATRMAN/redo02.log
-rw-rw---- 1 oracle oinstall 31M may 4 08:23 /u01/oracle/app/grid/oms10g/j2ee/OC4J_EM/log/OC4J_EM_default_island_1/default-web-access.log
These are the logs written in the last minute. I don’t think the poor redo log will tell me anything, but the other logs might give me some insight.
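Rather than listing every log on the server and reading the tail of the output, find can do the time filtering itself. This is a sketch: the name recent_logs is hypothetical, and -mmin is an extension available in GNU and BSD find rather than a POSIX option.

```shell
# recent_logs (hypothetical helper): list *.log files under DIR that were
# modified less than MINUTES minutes ago. Usage: recent_logs [DIR] [MINUTES]
recent_logs() {
    find "${1:-/}" -type f -name '*.log' -mmin "-${2:-1}" 2>/dev/null
}
```

With no arguments it searches from / for logs touched in the last minute; recent_logs /u01 5 would limit the search to /u01 and the last five minutes.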
… But none of them gave me any useful information. However, the log /u01/oracle/app/10.2.0.4/oms10g/opmn/logs/ons.log shows the following error many times over, and the file has grown to 1 GB:
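10/05/03 12:44:08 [4] Falta el factor de formato de la conexión local 0,127.0.0.1,6103
(The message is in Spanish. In English it reads roughly: "Local connection 0,127.0.0.1,6103 missing form factor".)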
So there is a problem with port 6103. After googling a bit, I edited the file /u01/oracle/app/10.2.0.4/opmn/conf/ons.config and changed:
localport=6103
remoteport=6203
loglevel=3
to
localport=6115
remoteport=6205
loglevel=3
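The same edit can be scripted with sed while keeping a backup of the original file. A minimal sketch, assuming the key=value layout shown above; the name set_ons_ports and the .bak suffix are my own choices.

```shell
# set_ons_ports (hypothetical helper): back up an ons.config-style file and
# rewrite its localport/remoteport lines. Usage: set_ons_ports FILE LOCAL REMOTE
set_ons_ports() {
    cp -p "$1" "$1.bak" || return 1          # keep the original as FILE.bak
    sed -e "s/^localport=.*/localport=$2/" \
        -e "s/^remoteport=.*/remoteport=$3/" "$1.bak" > "$1"
}
```

For the change described here, the call would be set_ons_ports followed by the path to ons.config, then 6115 and 6205. Other lines, such as loglevel, are left untouched.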
After that I rebooted…
[root@superlopez ~]# netstat -punta | wc -l
177
Problem solved! It looks like port 6103 was in conflict with another service, so after changing the ports in that file, everything started working fine.
One less problem…