soc: qcom: Add snapshot of System Health Monitor
This snapshot is taken as of msm-3.18 commit e70ad0cd (Promotion of kernel.lnx.3.18-151201.) Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
This commit is contained in:
parent
856dcc3bd8
commit
a8d687d1ee
7 changed files with 1424 additions and 0 deletions
213
Documentation/arm/msm/system_health_monitor.txt
Normal file
213
Documentation/arm/msm/system_health_monitor.txt
Normal file
|
@ -0,0 +1,213 @@
|
|||
Introduction
|
||||
============
|
||||
|
||||
System Health Monitor (SHM) passively monitors the health of the
|
||||
peripherals connected to the application processor. Software components
|
||||
in the application processor that experience communication failure can
|
||||
request the SHM to perform a system-wide health check. If any failures
|
||||
are detected during the health-check, then a subsystem restart will be
|
||||
triggered for the failed subsystem.
|
||||
|
||||
Hardware description
|
||||
====================
|
||||
|
||||
SHM is solely a software component and it interfaces with peripherals
|
||||
through QMI communication. SHM does not control any hardware blocks and
|
||||
it uses subsystem_restart to restart any peripheral.
|
||||
|
||||
Software description
|
||||
====================
|
||||
|
||||
SHM hosts a QMI service in the kernel that is connected to the Health
|
||||
Monitor Agents (HMA) hosted in the peripherals. HMAs in the peripherals
|
||||
are initialized along with other critical services in the peripherals and
|
||||
hence the connection between SHM and HMAs are established during the early
|
||||
stages of the peripheral boot-up procedure. Software components within the
|
||||
application processor, either user-space or kernel-space, identify any
|
||||
communication failure with the peripheral by a lack of response and report
|
||||
that failure to SHM. SHM checks the health of the entire system through
|
||||
HMAs that are connected to it. If all the HMAs respond in time, then the
|
||||
failure report by the software component is ignored. If any HMAs do not
|
||||
respond in time, then SHM will restart the concerned peripheral. Figure 1
|
||||
shows a high level design diagram and Figure 2 shows a flow diagram of the
|
||||
design.
|
||||
|
||||
Figure 1 - System Health Monitor Overview:
|
||||
|
||||
+------------------------------------+ +----------------------+
|
||||
| Application Processor | | Peripheral 1 |
|
||||
| +--------------+ | | +----------------+ |
|
||||
| | Applications | | | | Health Monitor | |
|
||||
| +------+-------+ | +------->| Agent 1 | |
|
||||
| User-space | | | | +----------------+ |
|
||||
+-------------------------|----------+ | +----------------------+
|
||||
| Kernel-space v | QMI .
|
||||
| +---------+ +---------------+ | | .
|
||||
| | Kernel |----->| System Health |<----+ .
|
||||
| | Drivers | | Monitor | | |
|
||||
| +---------+ +---------------+ | QMI +----------------------+
|
||||
| | | | Peripheral N |
|
||||
| | | | +----------------+ |
|
||||
| | | | | Health Monitor | |
|
||||
| | +------->| Agent N | |
|
||||
| | | +----------------+ |
|
||||
+------------------------------------+ +----------------------+
|
||||
|
||||
|
||||
Figure 2 - System Health Monitor Message Flow with 2 peripherals:
|
||||
|
||||
+-----------+ +-------+ +-------+ +-------+
|
||||
|Application| | SHM | | HMA 1 | | HMA 2 |
|
||||
+-----+-----+ +-------+ +---+---+ +---+---+
|
||||
| | | |
|
||||
| | | |
|
||||
| check_system | | |
|
||||
|------------------->| | |
|
||||
| _health() | Report_ | |
|
||||
| |---------------->| |
|
||||
| | health_req(1) | |
|
||||
| | | |
|
||||
| | Report_ | |
|
||||
| |---------------------------------->|
|
||||
| +-+ health_req(2) | |
|
||||
| |T| | |
|
||||
| |i| | |
|
||||
| |m| | |
|
||||
| |e| Report_ | |
|
||||
| |o|<---------------| |
|
||||
| |u| health_resp(1) | |
|
||||
| |t| | |
|
||||
| +-+ | |
|
||||
| | subsystem_ | |
|
||||
| |---------------------------------->|
|
||||
| | restart(2) | |
|
||||
+ + + +
|
||||
|
||||
HMAs can be extended to monitor the health of individual software services
|
||||
executing in their concerned peripherals. HMAs can restore the services
|
||||
that are not responding to a responsive state.
|
||||
|
||||
Design
|
||||
======
|
||||
|
||||
The design goal of SHM is to:
|
||||
* Restore the unresponsive peripheral to a responsive state.
|
||||
* Restore the unresponsive software services in a peripheral to a
|
||||
responsive state.
|
||||
* Perform power-efficient monitoring of the system health.
|
||||
|
||||
The alternate design discussion includes sending keepalive messages in
|
||||
IPC protocols at Transport Layer. This approach requires rolling out the
|
||||
protocol update in all the peripherals together and hence has considerable
|
||||
coupling unless a suitable feature negotiation algorithm is implemented.
|
||||
This approach also requires all the IPC protocols at transport layer to be
|
||||
updated and hence replication of effort. There are multiple link-layer
|
||||
protocols and adding keep-alive at the link-layer protocols does not solve
|
||||
issues at the client layer which is solved by SHM. Restoring a peripheral
|
||||
or a remote software service by an IPC protocol has not been an industry
|
||||
standard practice. Industry standard IPC protocols only terminate the
|
||||
connection if there is any communication failure and rely upon other
|
||||
mechanisms to restore the system to full operation.
|
||||
|
||||
Power Management
|
||||
================
|
||||
|
||||
This driver ensures that the health monitor messages are sent only upon
|
||||
request and hence does not wake up application processor or any peripheral
|
||||
unnecessarily.
|
||||
|
||||
SMP/multi-core
|
||||
==============
|
||||
|
||||
This driver uses standard kernel mutexes and wait queues to achieve any
|
||||
required synchronization.
|
||||
|
||||
Security
|
||||
========
|
||||
|
||||
Denial of Service (DoS) attack by an application that keeps requesting
|
||||
health checks at a high rate can be throttled by the SHM to minimize the
|
||||
impact of the misbehaving application.
|
||||
|
||||
Interface
|
||||
=========
|
||||
|
||||
Kernel-space APIs:
|
||||
------------------
|
||||
/**
|
||||
* kern_check_system_health() - Check the system health
|
||||
*
|
||||
* @return: 0 on success, standard Linux error codes on failure.
|
||||
*
|
||||
* This function is used by the kernel drivers to initiate the
|
||||
* system health check. This function in turn trigger SHM to send
|
||||
* QMI message to all the HMAs connected to it.
|
||||
*/
|
||||
int kern_check_system_health(void);
|
||||
|
||||
User-space Interface:
|
||||
---------------------
|
||||
This driver provides a devfs interface(/dev/system_health_monitor) to the
|
||||
user-space. A wrapper API library will be provided to the user-space
|
||||
applications in order to initiate the system health check. The API in turn
|
||||
will interface with the driver through the sysfs interface provided by the
|
||||
driver.
|
||||
|
||||
/**
|
||||
* check_system_health() - Check the system health
|
||||
*
|
||||
* @return: 0 on success, -1 on failure.
|
||||
*
|
||||
* This function is used by the user-space applications to initiate the
|
||||
* system health check. This function in turn trigger SHM to send QMI
|
||||
* message to all the HMAs connected to it.
|
||||
*/
|
||||
int check_system_health(void);
|
||||
|
||||
The above mentioned interface function works by opening the sysfs
|
||||
interface provided by SHM, perform an ioctl operation and then close the
|
||||
sysfs interface. The concerned ioctl command(CHECK_SYS_HEALTH_IOCTL) does
|
||||
not take any argument. This function performs the health check, handles the
|
||||
response and timeout in an asynchronous manner.
|
||||
|
||||
Driver parameters
|
||||
=================
|
||||
|
||||
The time duration for which the SHM has to wait before a response
|
||||
arrives from HMAs can be configured using a module parameter. This
|
||||
parameter will be used only for debugging purposes. The default SHM health
|
||||
check timeout is 2s, which can be overwritten by the timeout provided by
|
||||
HMA during the connection establishment.
|
||||
|
||||
Config options
|
||||
==============
|
||||
|
||||
This driver is enabled through kernel config option
|
||||
CONFIG_SYSTEM_HEALTH_MONITOR.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
This driver depends on the following kernel modules for its complete
|
||||
functionality:
|
||||
* Kernel QMI interface
|
||||
* Subsystem Restart support
|
||||
|
||||
User space utilities
|
||||
====================
|
||||
|
||||
Any user-space or kernel-space modules that experience communication
|
||||
failure with peripherals will interface with this driver. Some of the
|
||||
modules include:
|
||||
* RIL
|
||||
* Location Manager
|
||||
* Data Services
|
||||
|
||||
Other
|
||||
=====
|
||||
|
||||
SHM provides a debug interface to enumerate some information regarding the
|
||||
recent health checks. The debug information includes, but not limited to:
|
||||
* application name that triggered the health check.
|
||||
* time of the health check.
|
||||
* status of the health check.
|
|
@ -211,6 +211,17 @@ config MSM_IPC_ROUTER_GLINK_XPRT
|
|||
this layer registers a transport with IPC Router and enable
|
||||
message exchange.
|
||||
|
||||
config MSM_SYSTEM_HEALTH_MONITOR
|
||||
bool "System Health Monitor"
|
||||
depends on MSM_QMI_INTERFACE && MSM_SUBSYSTEM_RESTART
|
||||
help
|
||||
System Health Monitor (SHM) passively monitors the health of the
|
||||
peripherals connected to the application processor. Software
|
||||
components in the application processor that experience
|
||||
communication failure can request the SHM to perform a system-wide
|
||||
health check. If any failures are detected during the health-check,
|
||||
then a subsystem restart will be triggered for the failed subsystem.
|
||||
|
||||
config MSM_GLINK_PKT
|
||||
bool "Enable device interface for GLINK packet channels"
|
||||
depends on MSM_GLINK
|
||||
|
|
|
@ -18,6 +18,8 @@ obj-$(CONFIG_MSM_IPC_ROUTER_SMD_XPRT) += ipc_router_smd_xprt.o
|
|||
obj-$(CONFIG_MSM_IPC_ROUTER_HSIC_XPRT) += ipc_router_hsic_xprt.o
|
||||
obj-$(CONFIG_MSM_IPC_ROUTER_MHI_XPRT) += ipc_router_mhi_xprt.o
|
||||
obj-$(CONFIG_MSM_IPC_ROUTER_GLINK_XPRT) += ipc_router_glink_xprt.o
|
||||
obj-$(CONFIG_MSM_SYSTEM_HEALTH_MONITOR) += system_health_monitor_v01.o
|
||||
obj-$(CONFIG_MSM_SYSTEM_HEALTH_MONITOR) += system_health_monitor.o
|
||||
obj-$(CONFIG_MSM_GLINK_PKT) += msm_glink_pkt.o
|
||||
|
||||
obj-$(CONFIG_MEM_SHARE_QMI_SERVICE) += memshare/
|
||||
|
|
964
drivers/soc/qcom/system_health_monitor.c
Normal file
964
drivers/soc/qcom/system_health_monitor.c
Normal file
|
@ -0,0 +1,964 @@
|
|||
/* Copyright (c) 2014, The Linux Foundation. All rights reserved.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 and
|
||||
* only version 2 as published by the Free Software Foundation.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*/
|
||||
#include <linux/cdev.h>
|
||||
#include <linux/delay.h>
|
||||
#include <linux/device.h>
|
||||
#include <linux/errno.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/ioctl.h>
|
||||
#include <linux/ipc_logging.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/of.h>
|
||||
#include <linux/platform_device.h>
|
||||
#include <linux/qmi_encdec.h>
|
||||
#include <linux/ratelimit.h>
|
||||
#include <linux/sched.h>
|
||||
#include <linux/slab.h>
|
||||
#include <linux/srcu.h>
|
||||
#include <linux/thread_info.h>
|
||||
#include <linux/uaccess.h>
|
||||
|
||||
#include <soc/qcom/msm_qmi_interface.h>
|
||||
#include <soc/qcom/subsystem_notif.h>
|
||||
#include <soc/qcom/subsystem_restart.h>
|
||||
|
||||
#include "system_health_monitor_v01.h"
|
||||
|
||||
#define MODULE_NAME "system_health_monitor"
|
||||
|
||||
#define SUBSYS_NAME_LEN 256
|
||||
#define SSRESTART_STRLEN 256
|
||||
|
||||
enum {
|
||||
SHM_INFO_FLAG = 0x1,
|
||||
SHM_DEBUG_FLAG = 0x2,
|
||||
};
|
||||
static int shm_debug_mask = SHM_INFO_FLAG;
|
||||
module_param_named(debug_mask, shm_debug_mask,
|
||||
int, S_IRUGO | S_IWUSR | S_IWGRP);
|
||||
static int shm_default_timeout_ms = 2000;
|
||||
module_param_named(default_timeout_ms, shm_default_timeout_ms,
|
||||
int, S_IRUGO | S_IWUSR | S_IWGRP);
|
||||
|
||||
#define DEFAULT_SHM_RATELIMIT_INTERVAL (HZ / 5)
|
||||
#define DEFAULT_SHM_RATELIMIT_BURST 2
|
||||
|
||||
#define SHM_ILCTXT_NUM_PAGES 2
|
||||
static void *shm_ilctxt;
|
||||
#define SHM_INFO_LOG(x...) do { \
|
||||
if ((shm_debug_mask & SHM_INFO_FLAG) && shm_ilctxt) \
|
||||
ipc_log_string(shm_ilctxt, x); \
|
||||
} while (0)
|
||||
|
||||
#define SHM_DEBUG(x...) do { \
|
||||
if ((shm_debug_mask & SHM_DEBUG_FLAG) && shm_ilctxt) \
|
||||
ipc_log_string(shm_ilctxt, x); \
|
||||
} while (0)
|
||||
|
||||
#define SHM_ERR(x...) do { \
|
||||
if (shm_ilctxt) \
|
||||
ipc_log_string(shm_ilctxt, x); \
|
||||
pr_err(x); \
|
||||
} while (0)
|
||||
|
||||
struct class *system_health_monitor_classp;
|
||||
static dev_t system_health_monitor_dev;
|
||||
static struct cdev system_health_monitor_cdev;
|
||||
static struct device *system_health_monitor_devp;
|
||||
|
||||
#define SYSTEM_HEALTH_MONITOR_IOCTL_MAGIC (0xC3)
|
||||
|
||||
#define CHECK_SYSTEM_HEALTH_IOCTL \
|
||||
_IOR(SYSTEM_HEALTH_MONITOR_IOCTL_MAGIC, 0, unsigned int)
|
||||
|
||||
static struct workqueue_struct *shm_svc_workqueue;
|
||||
static void shm_svc_recv_msg(struct work_struct *work);
|
||||
static DECLARE_DELAYED_WORK(work_recv_msg, shm_svc_recv_msg);
|
||||
static struct qmi_handle *shm_svc_handle;
|
||||
|
||||
struct disconnect_work {
|
||||
struct work_struct work;
|
||||
void *conn_h;
|
||||
};
|
||||
static void shm_svc_disconnect_worker(struct work_struct *work);
|
||||
|
||||
struct req_work {
|
||||
struct work_struct work;
|
||||
void *conn_h;
|
||||
void *req_h;
|
||||
unsigned int msg_id;
|
||||
void *req;
|
||||
};
|
||||
static void shm_svc_req_worker(struct work_struct *work);
|
||||
|
||||
/**
|
||||
* struct hma_info - Information about a Health Monitor Agent(HMA)
|
||||
* @list: List to chain up the hma to the hma_list.
|
||||
* @subsys_name: Name of the remote subsystem that hosts this HMA.
|
||||
* @ssrestart_string: String to restart the subsystem that hosts this HMA.
|
||||
* @conn_h: Opaque connection handle to the HMA.
|
||||
* @timeout: Timeout as registered by the HMA.
|
||||
* @check_count: Count of the health check attempts.
|
||||
* @report_count: Count of the health reports handled.
|
||||
* @reset_srcu: Sleepable RCU to protect the reset state.
|
||||
* @is_in_reset: Flag to identify if the remote subsystem is in reset.
|
||||
* @restart_nb: Notifier block to receive subsystem restart events.
|
||||
* @restart_nb_h: Handle to subsystem restart notifier block.
|
||||
* @rs: Rate-limit the health check.
|
||||
*/
|
||||
struct hma_info {
|
||||
struct list_head list;
|
||||
char subsys_name[SUBSYS_NAME_LEN];
|
||||
char ssrestart_string[SSRESTART_STRLEN];
|
||||
void *conn_h;
|
||||
uint32_t timeout;
|
||||
atomic_t check_count;
|
||||
atomic_t report_count;
|
||||
struct srcu_struct reset_srcu;
|
||||
atomic_t is_in_reset;
|
||||
struct notifier_block restart_nb;
|
||||
void *restart_nb_h;
|
||||
struct ratelimit_state rs;
|
||||
};
|
||||
|
||||
struct restart_work {
|
||||
struct delayed_work dwork;
|
||||
struct hma_info *hmap;
|
||||
void *conn_h;
|
||||
int check_count;
|
||||
};
|
||||
static void shm_svc_restart_worker(struct work_struct *work);
|
||||
|
||||
static DEFINE_MUTEX(hma_info_list_lock);
|
||||
static LIST_HEAD(hma_info_list);
|
||||
|
||||
static struct msg_desc shm_svc_register_req_desc = {
|
||||
.max_msg_len = HMON_REGISTER_REQ_MSG_V01_MAX_MSG_LEN,
|
||||
.msg_id = QMI_HEALTH_MON_REG_REQ_V01,
|
||||
.ei_array = hmon_register_req_msg_v01_ei,
|
||||
};
|
||||
|
||||
static struct msg_desc shm_svc_register_resp_desc = {
|
||||
.max_msg_len = HMON_REGISTER_RESP_MSG_V01_MAX_MSG_LEN,
|
||||
.msg_id = QMI_HEALTH_MON_REG_RESP_V01,
|
||||
.ei_array = hmon_register_resp_msg_v01_ei,
|
||||
};
|
||||
|
||||
static struct msg_desc shm_svc_health_check_ind_desc = {
|
||||
.max_msg_len = HMON_HEALTH_CHECK_IND_MSG_V01_MAX_MSG_LEN,
|
||||
.msg_id = QMI_HEALTH_MON_HEALTH_CHECK_IND_V01,
|
||||
.ei_array = hmon_health_check_ind_msg_v01_ei,
|
||||
};
|
||||
|
||||
static struct msg_desc shm_svc_health_check_complete_req_desc = {
|
||||
.max_msg_len = HMON_HEALTH_CHECK_COMPLETE_REQ_MSG_V01_MAX_MSG_LEN,
|
||||
.msg_id = QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_REQ_V01,
|
||||
.ei_array = hmon_health_check_complete_req_msg_v01_ei,
|
||||
};
|
||||
|
||||
static struct msg_desc shm_svc_health_check_complete_resp_desc = {
|
||||
.max_msg_len = HMON_HEALTH_CHECK_COMPLETE_RESP_MSG_V01_MAX_MSG_LEN,
|
||||
.msg_id = QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_RESP_V01,
|
||||
.ei_array = hmon_health_check_complete_resp_msg_v01_ei,
|
||||
};
|
||||
|
||||
/**
|
||||
* restart_notifier_cb() - Callback to handle SSR events
|
||||
* @this: Reference to the notifier block.
|
||||
* @code: Type of SSR event.
|
||||
* @data: Data that needs to be handled as part of SSR event.
|
||||
*
|
||||
* This function is used to identify if a subsystem which hosts an HMA
|
||||
* is already in reset, so that a duplicate subsystem restart is not
|
||||
* triggered.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int restart_notifier_cb(struct notifier_block *this,
|
||||
unsigned long code, void *data)
|
||||
{
|
||||
struct hma_info *tmp_hma_info =
|
||||
container_of(this, struct hma_info, restart_nb);
|
||||
|
||||
if (code == SUBSYS_BEFORE_SHUTDOWN) {
|
||||
atomic_set(&tmp_hma_info->is_in_reset, 1);
|
||||
synchronize_srcu(&tmp_hma_info->reset_srcu);
|
||||
SHM_INFO_LOG("%s: %s going to shutdown\n",
|
||||
__func__, tmp_hma_info->ssrestart_string);
|
||||
} else if (code == SUBSYS_AFTER_POWERUP) {
|
||||
atomic_set(&tmp_hma_info->is_in_reset, 0);
|
||||
SHM_INFO_LOG("%s: %s powered up\n",
|
||||
__func__, tmp_hma_info->ssrestart_string);
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_restart_worker() - Worker to restart a subsystem
|
||||
* @work: Reference to the work item being handled.
|
||||
*
|
||||
* This function restarts the subsystem which hosts an HMA. This function
|
||||
* checks the following before triggering a restart:
|
||||
* 1) Health check report is not received.
|
||||
* 2) The subsystem has not undergone a reset.
|
||||
* 3) The subsystem is not undergoing a reset.
|
||||
*/
|
||||
static void shm_svc_restart_worker(struct work_struct *work)
|
||||
{
|
||||
int rc;
|
||||
struct delayed_work *dwork = to_delayed_work(work);
|
||||
struct restart_work *rwp =
|
||||
container_of(dwork, struct restart_work, dwork);
|
||||
struct hma_info *tmp_hma_info = rwp->hmap;
|
||||
int rcu_id;
|
||||
|
||||
if (rwp->check_count <= atomic_read(&tmp_hma_info->report_count)) {
|
||||
SHM_INFO_LOG("%s: No Action on Health Check Attempt %d to %s\n",
|
||||
__func__, rwp->check_count,
|
||||
tmp_hma_info->subsys_name);
|
||||
kfree(rwp);
|
||||
return;
|
||||
}
|
||||
|
||||
if (!tmp_hma_info->conn_h || rwp->conn_h != tmp_hma_info->conn_h) {
|
||||
SHM_INFO_LOG(
|
||||
"%s: Connection to %s is reset. No further action\n",
|
||||
__func__, tmp_hma_info->subsys_name);
|
||||
kfree(rwp);
|
||||
return;
|
||||
}
|
||||
|
||||
rcu_id = srcu_read_lock(&tmp_hma_info->reset_srcu);
|
||||
if (atomic_read(&tmp_hma_info->is_in_reset)) {
|
||||
SHM_INFO_LOG(
|
||||
"%s: %s is going thru restart. No further action\n",
|
||||
__func__, tmp_hma_info->subsys_name);
|
||||
srcu_read_unlock(&tmp_hma_info->reset_srcu, rcu_id);
|
||||
kfree(rwp);
|
||||
return;
|
||||
}
|
||||
|
||||
SHM_ERR("%s: HMA in %s failed to respond in time. Restarting %s...\n",
|
||||
__func__, tmp_hma_info->subsys_name,
|
||||
tmp_hma_info->ssrestart_string);
|
||||
rc = subsystem_restart(tmp_hma_info->ssrestart_string);
|
||||
if (rc < 0)
|
||||
SHM_ERR("%s: Error %d restarting %s\n",
|
||||
__func__, rc, tmp_hma_info->ssrestart_string);
|
||||
srcu_read_unlock(&tmp_hma_info->reset_srcu, rcu_id);
|
||||
kfree(rwp);
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_send_health_check_ind() - Initiate a subsystem health check
|
||||
* @tmp_hma_info: Info about an HMA which resides in a subsystem.
|
||||
*
|
||||
* This function initiates a health check of a subsytem, which hosts the
|
||||
* HMA, by sending a health check QMI indication message.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int shm_send_health_check_ind(struct hma_info *tmp_hma_info)
|
||||
{
|
||||
int rc;
|
||||
struct restart_work *rwp;
|
||||
|
||||
if (!tmp_hma_info->conn_h)
|
||||
return 0;
|
||||
|
||||
/* Rate limit the health check as configured by the subsystem */
|
||||
if (!__ratelimit(&tmp_hma_info->rs))
|
||||
return 0;
|
||||
|
||||
rwp = kzalloc(sizeof(*rwp), GFP_KERNEL);
|
||||
if (!rwp) {
|
||||
SHM_ERR("%s: Error allocating restart work\n", __func__);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
INIT_DELAYED_WORK(&rwp->dwork, shm_svc_restart_worker);
|
||||
rwp->hmap = tmp_hma_info;
|
||||
rwp->conn_h = tmp_hma_info->conn_h;
|
||||
|
||||
rc = qmi_send_ind(shm_svc_handle, tmp_hma_info->conn_h,
|
||||
&shm_svc_health_check_ind_desc, NULL, 0);
|
||||
if (rc < 0) {
|
||||
SHM_ERR("%s: Send Error %d to %s\n",
|
||||
__func__, rc, tmp_hma_info->subsys_name);
|
||||
kfree(rwp);
|
||||
return rc;
|
||||
}
|
||||
|
||||
rwp->check_count = atomic_inc_return(&tmp_hma_info->check_count);
|
||||
queue_delayed_work(shm_svc_workqueue, &rwp->dwork,
|
||||
msecs_to_jiffies(tmp_hma_info->timeout));
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* kern_check_system_health() - Check the system health
|
||||
*
|
||||
* This function is used by the kernel drivers to initiate the
|
||||
* system health check. This function in turn triggers SHM to send
|
||||
* QMI message to all the HMAs connected to it.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
int kern_check_system_health(void)
|
||||
{
|
||||
int rc;
|
||||
int final_rc = 0;
|
||||
struct hma_info *tmp_hma_info;
|
||||
|
||||
mutex_lock(&hma_info_list_lock);
|
||||
list_for_each_entry(tmp_hma_info, &hma_info_list, list) {
|
||||
rc = shm_send_health_check_ind(tmp_hma_info);
|
||||
if (rc < 0) {
|
||||
SHM_ERR("%s by %s failed for %s - rc %d\n", __func__,
|
||||
current->comm, tmp_hma_info->subsys_name, rc);
|
||||
final_rc = rc;
|
||||
}
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
return final_rc;
|
||||
}
|
||||
EXPORT_SYMBOL(kern_check_system_health);
|
||||
|
||||
/**
|
||||
* shm_svc_connect_cb() - Callback to handle connect event from an HMA
|
||||
* @handle: QMI Service handle in which a connect event is received.
|
||||
* @conn_h: Opaque reference to the connection handle.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int shm_svc_connect_cb(struct qmi_handle *handle, void *conn_h)
|
||||
{
|
||||
SHM_DEBUG("%s: conn_h %p\n", __func__, conn_h);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_disconnect_worker() - Worker to handle disconnect event from an HMA
|
||||
* @work: Reference to the work item.
|
||||
*
|
||||
* This function handles the disconnect event from an HMA in a deferred manner.
|
||||
*/
|
||||
static void shm_svc_disconnect_worker(struct work_struct *work)
|
||||
{
|
||||
struct hma_info *tmp_hma_info;
|
||||
struct disconnect_work *dwp =
|
||||
container_of(work, struct disconnect_work, work);
|
||||
|
||||
mutex_lock(&hma_info_list_lock);
|
||||
list_for_each_entry(tmp_hma_info, &hma_info_list, list) {
|
||||
if (dwp->conn_h == tmp_hma_info->conn_h) {
|
||||
SHM_INFO_LOG("%s: conn_h %p to HMA in %s exited\n",
|
||||
__func__, dwp->conn_h,
|
||||
tmp_hma_info->subsys_name);
|
||||
tmp_hma_info->conn_h = NULL;
|
||||
atomic_set(&tmp_hma_info->report_count,
|
||||
atomic_read(&tmp_hma_info->check_count));
|
||||
break;
|
||||
}
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
kfree(dwp);
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_disconnect_cb() - Callback to handle disconnect event from an HMA
|
||||
* @handle: QMI Service handle in which a disconnect event is received.
|
||||
* @conn_h: Opaque reference to the connection handle.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int shm_svc_disconnect_cb(struct qmi_handle *handle, void *conn_h)
|
||||
{
|
||||
struct disconnect_work *dwp;
|
||||
|
||||
dwp = kzalloc(sizeof(*dwp), GFP_ATOMIC);
|
||||
if (!dwp) {
|
||||
SHM_ERR("%s: Error allocating work item\n", __func__);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
INIT_WORK(&dwp->work, shm_svc_disconnect_worker);
|
||||
dwp->conn_h = conn_h;
|
||||
queue_work(shm_svc_workqueue, &dwp->work);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_req_desc_cb() - Callback to identify the request descriptor
|
||||
* @msg_id: Message ID of the QMI request.
|
||||
* @req_desc: Request Descriptor of the QMI request.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int shm_svc_req_desc_cb(unsigned int msg_id,
|
||||
struct msg_desc **req_desc)
|
||||
{
|
||||
int rc;
|
||||
SHM_DEBUG("%s: called for msg_id %d\n", __func__, msg_id);
|
||||
switch (msg_id) {
|
||||
case QMI_HEALTH_MON_REG_REQ_V01:
|
||||
*req_desc = &shm_svc_register_req_desc;
|
||||
rc = sizeof(struct hmon_register_req_msg_v01);
|
||||
break;
|
||||
|
||||
case QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_REQ_V01:
|
||||
*req_desc = &shm_svc_health_check_complete_req_desc;
|
||||
rc = sizeof(struct hmon_health_check_complete_req_msg_v01);
|
||||
break;
|
||||
|
||||
default:
|
||||
SHM_ERR("%s: Invalid msg_id %d\n", __func__, msg_id);
|
||||
rc = -ENOTSUPP;
|
||||
}
|
||||
return rc;
|
||||
}
|
||||
|
||||
/**
|
||||
* handle_health_mon_reg_req() - Handle the HMA register QMI request
|
||||
* @conn_h: Opaque reference to the connection handle to an HMA.
|
||||
* @req_h: Opaque reference to the request handle.
|
||||
* @buf: Pointer to the QMI request structure.
|
||||
*
|
||||
* This function handles the register request from an HMA. The request
|
||||
* contains the subsystem name which hosts the HMA and health check
|
||||
* timeout for the HMA.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int handle_health_mon_reg_req(void *conn_h, void *req_h, void *buf)
|
||||
{
|
||||
int rc;
|
||||
struct hma_info *tmp_hma_info;
|
||||
struct hmon_register_req_msg_v01 *req =
|
||||
(struct hmon_register_req_msg_v01 *)buf;
|
||||
struct hmon_register_resp_msg_v01 resp;
|
||||
bool hma_info_found = false;
|
||||
|
||||
if (!req->name_valid) {
|
||||
SHM_ERR("%s: host name invalid\n", __func__);
|
||||
goto send_reg_resp;
|
||||
}
|
||||
|
||||
mutex_lock(&hma_info_list_lock);
|
||||
list_for_each_entry(tmp_hma_info, &hma_info_list, list) {
|
||||
if (!strcmp(tmp_hma_info->subsys_name, req->name) &&
|
||||
!tmp_hma_info->conn_h) {
|
||||
tmp_hma_info->conn_h = conn_h;
|
||||
if (req->timeout_valid)
|
||||
tmp_hma_info->timeout = req->timeout;
|
||||
else
|
||||
tmp_hma_info->timeout = shm_default_timeout_ms;
|
||||
ratelimit_state_init(&tmp_hma_info->rs,
|
||||
DEFAULT_SHM_RATELIMIT_INTERVAL,
|
||||
DEFAULT_SHM_RATELIMIT_BURST);
|
||||
SHM_INFO_LOG("%s: from %s timeout_ms %d\n",
|
||||
__func__, req->name, tmp_hma_info->timeout);
|
||||
hma_info_found = true;
|
||||
} else if (!strcmp(tmp_hma_info->subsys_name, req->name)) {
|
||||
SHM_ERR("%s: Duplicate HMA from %s - cur %p, new %p\n",
|
||||
__func__, req->name, tmp_hma_info->conn_h,
|
||||
conn_h);
|
||||
}
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
|
||||
send_reg_resp:
|
||||
if (hma_info_found) {
|
||||
memset(&resp, 0, sizeof(resp));
|
||||
} else {
|
||||
resp.resp.result = QMI_RESULT_FAILURE_V01;
|
||||
resp.resp.error = QMI_ERR_INVALID_ID_V01;
|
||||
}
|
||||
rc = qmi_send_resp(shm_svc_handle, conn_h, req_h,
|
||||
&shm_svc_register_resp_desc, &resp, sizeof(resp));
|
||||
if (rc < 0)
|
||||
SHM_ERR("%s: send_resp failed to %s - rc %d\n",
|
||||
__func__, req->name, rc);
|
||||
return rc;
|
||||
}
|
||||
|
||||
/**
|
||||
* handle_health_mon_health_check_complete_req() - Handle the HMA health report
|
||||
* @conn_h: Opaque reference to the connection handle to an HMA.
|
||||
* @req_h: Opaque reference to the request handle.
|
||||
* @buf: Pointer to the QMI request structure.
|
||||
*
|
||||
* This function handles health reports from an HMA. The health report is sent
|
||||
* in response to a health check QMI indication sent by SHM.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int handle_health_mon_health_check_complete_req(void *conn_h,
|
||||
void *req_h, void *buf)
|
||||
{
|
||||
int rc;
|
||||
struct hma_info *tmp_hma_info;
|
||||
struct hmon_health_check_complete_req_msg_v01 *req =
|
||||
(struct hmon_health_check_complete_req_msg_v01 *)buf;
|
||||
struct hmon_health_check_complete_resp_msg_v01 resp;
|
||||
bool hma_info_found = false;
|
||||
|
||||
if (!req->result_valid) {
|
||||
SHM_ERR("%s: Invalid result\n", __func__);
|
||||
goto send_resp;
|
||||
}
|
||||
|
||||
mutex_lock(&hma_info_list_lock);
|
||||
list_for_each_entry(tmp_hma_info, &hma_info_list, list) {
|
||||
if (tmp_hma_info->conn_h != conn_h)
|
||||
continue;
|
||||
hma_info_found = true;
|
||||
if (req->result == HEALTH_MONITOR_CHECK_SUCCESS_V01) {
|
||||
atomic_inc(&tmp_hma_info->report_count);
|
||||
SHM_INFO_LOG("%s: %s Health Check Success\n",
|
||||
__func__, tmp_hma_info->subsys_name);
|
||||
} else {
|
||||
SHM_INFO_LOG("%s: %s Health Check Failure\n",
|
||||
__func__, tmp_hma_info->subsys_name);
|
||||
}
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
|
||||
send_resp:
|
||||
if (hma_info_found) {
|
||||
memset(&resp, 0, sizeof(resp));
|
||||
} else {
|
||||
resp.resp.result = QMI_RESULT_FAILURE_V01;
|
||||
resp.resp.error = QMI_ERR_INVALID_ID_V01;
|
||||
}
|
||||
rc = qmi_send_resp(shm_svc_handle, conn_h, req_h,
|
||||
&shm_svc_health_check_complete_resp_desc,
|
||||
&resp, sizeof(resp));
|
||||
if (rc < 0)
|
||||
SHM_ERR("%s: send_resp failed - rc %d\n",
|
||||
__func__, rc);
|
||||
return rc;
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_req_worker() - Worker to handle QMI requests
|
||||
* @work: Reference to the work item.
|
||||
*
|
||||
* This function handles QMI requests from HMAs in a deferred manner.
|
||||
*/
|
||||
static void shm_svc_req_worker(struct work_struct *work)
|
||||
{
|
||||
struct req_work *rwp =
|
||||
container_of(work, struct req_work, work);
|
||||
|
||||
switch (rwp->msg_id) {
|
||||
case QMI_HEALTH_MON_REG_REQ_V01:
|
||||
handle_health_mon_reg_req(rwp->conn_h, rwp->req_h, rwp->req);
|
||||
break;
|
||||
|
||||
case QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_REQ_V01:
|
||||
handle_health_mon_health_check_complete_req(rwp->conn_h,
|
||||
rwp->req_h, rwp->req);
|
||||
break;
|
||||
default:
|
||||
SHM_ERR("%s: Invalid msg_id %d\n", __func__, rwp->msg_id);
|
||||
}
|
||||
kfree(rwp->req);
|
||||
kfree(rwp);
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_req_cb() - Callback to notify about QMI requests from HMA
|
||||
* @handle; QMI Service handle in which the request is received.
|
||||
* @conn_h: Opaque reference to the connection handle to an HMA.
|
||||
* @req_h: Opaque reference to the request handle.
|
||||
* @msg_id: Message ID of the request.
|
||||
* @req: Pointer to the request structure.
|
||||
*
|
||||
* This function is called by kernel QMI Service Interface to notify the
|
||||
* incoming QMI request on the SHM service handle.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int shm_svc_req_cb(struct qmi_handle *handle, void *conn_h,
|
||||
void *req_h, unsigned int msg_id, void *req)
|
||||
{
|
||||
struct req_work *rwp;
|
||||
void *req_buf;
|
||||
uint32_t req_sz = 0;
|
||||
|
||||
rwp = kzalloc(sizeof(*rwp), GFP_KERNEL);
|
||||
if (!rwp) {
|
||||
SHM_ERR("%s: Error allocating work item\n", __func__);
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
switch (msg_id) {
|
||||
case QMI_HEALTH_MON_REG_REQ_V01:
|
||||
req_sz = sizeof(struct hmon_register_req_msg_v01);
|
||||
break;
|
||||
|
||||
case QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_REQ_V01:
|
||||
req_sz = sizeof(struct hmon_health_check_complete_req_msg_v01);
|
||||
break;
|
||||
|
||||
default:
|
||||
SHM_ERR("%s: Invalid msg_id %d\n", __func__, msg_id);
|
||||
kfree(rwp);
|
||||
return -ENOTSUPP;
|
||||
}
|
||||
|
||||
req_buf = kzalloc(req_sz, GFP_KERNEL);
|
||||
if (!req_buf) {
|
||||
SHM_ERR("%s: Error allocating request buffer\n", __func__);
|
||||
kfree(rwp);
|
||||
return -ENOMEM;
|
||||
}
|
||||
memcpy(req_buf, req, req_sz);
|
||||
|
||||
INIT_WORK(&rwp->work, shm_svc_req_worker);
|
||||
rwp->conn_h = conn_h;
|
||||
rwp->req_h = req_h;
|
||||
rwp->msg_id = msg_id;
|
||||
rwp->req = req_buf;
|
||||
queue_work(shm_svc_workqueue, &rwp->work);
|
||||
return 0;
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_recv_msg() - Worker to receive a QMI message
|
||||
* @work: Reference to the work item.
|
||||
*
|
||||
* This function handles any incoming QMI messages to the SHM QMI service.
|
||||
*/
|
||||
static void shm_svc_recv_msg(struct work_struct *work)
|
||||
{
|
||||
int rc;
|
||||
|
||||
do {
|
||||
SHM_DEBUG("%s: Notified about a receive event\n", __func__);
|
||||
} while ((rc = qmi_recv_msg(shm_svc_handle)) == 0);
|
||||
|
||||
if (rc != -ENOMSG)
|
||||
SHM_ERR("%s: Error %d receiving message\n", __func__, rc);
|
||||
}
|
||||
|
||||
/**
|
||||
* shm_svc_notify() - Callback function to receive SHM QMI service events
|
||||
* @handle: QMI handle in which the event is received.
|
||||
* @event: Type of the QMI event.
|
||||
* @priv: Opaque reference to the private data as registered by the
|
||||
* service.
|
||||
*/
|
||||
static void shm_svc_notify(struct qmi_handle *handle,
|
||||
enum qmi_event_type event, void *priv)
|
||||
{
|
||||
switch (event) {
|
||||
case QMI_RECV_MSG:
|
||||
queue_delayed_work(shm_svc_workqueue, &work_recv_msg, 0);
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
static struct qmi_svc_ops_options shm_svc_ops_options = {
|
||||
.version = 1,
|
||||
.service_id = HMON_SERVICE_ID_V01,
|
||||
.service_vers = HMON_SERVICE_VERS_V01,
|
||||
.service_ins = 0,
|
||||
.connect_cb = shm_svc_connect_cb,
|
||||
.disconnect_cb = shm_svc_disconnect_cb,
|
||||
.req_desc_cb = shm_svc_req_desc_cb,
|
||||
.req_cb = shm_svc_req_cb,
|
||||
};
|
||||
|
||||
static int system_health_monitor_open(struct inode *inode, struct file *file)
|
||||
{
|
||||
SHM_DEBUG("%s by %s\n", __func__, current->comm);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static int system_health_monitor_release(struct inode *inode,
|
||||
struct file *file)
|
||||
{
|
||||
SHM_DEBUG("%s by %s\n", __func__, current->comm);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static ssize_t system_health_monitor_write(struct file *file,
|
||||
const char __user *buf, size_t count, loff_t *ppos)
|
||||
{
|
||||
SHM_ERR("%s by %s\n", __func__, current->comm);
|
||||
return -ENOTSUPP;
|
||||
}
|
||||
|
||||
static ssize_t system_health_monitor_read(struct file *file, char __user *buf,
|
||||
size_t count, loff_t *ppos)
|
||||
{
|
||||
SHM_ERR("%s by %s\n", __func__, current->comm);
|
||||
return -ENOTSUPP;
|
||||
}
|
||||
|
||||
static long system_health_monitor_ioctl(struct file *file, unsigned int cmd,
|
||||
unsigned long arg)
|
||||
{
|
||||
int rc;
|
||||
|
||||
switch (cmd) {
|
||||
case CHECK_SYSTEM_HEALTH_IOCTL:
|
||||
SHM_INFO_LOG("%s by %s\n", __func__, current->comm);
|
||||
rc = kern_check_system_health();
|
||||
break;
|
||||
default:
|
||||
SHM_ERR("%s: Invalid cmd %d by %s\n",
|
||||
__func__, cmd, current->comm);
|
||||
rc = -EINVAL;
|
||||
}
|
||||
return rc;
|
||||
}
|
||||
|
||||
static const struct file_operations system_health_monitor_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.open = system_health_monitor_open,
|
||||
.release = system_health_monitor_release,
|
||||
.read = system_health_monitor_read,
|
||||
.write = system_health_monitor_write,
|
||||
.unlocked_ioctl = system_health_monitor_ioctl,
|
||||
.compat_ioctl = system_health_monitor_ioctl,
|
||||
};
|
||||
|
||||
/**
|
||||
* start_system_health_monitor_service() - Start the SHM QMI service
|
||||
*
|
||||
* This function registers the SHM QMI service, if it is not already
|
||||
* registered.
|
||||
*/
|
||||
static int start_system_health_monitor_service(void)
|
||||
{
|
||||
int rc;
|
||||
|
||||
shm_svc_workqueue = create_singlethread_workqueue("shm_svc");
|
||||
if (!shm_svc_workqueue) {
|
||||
SHM_ERR("%s: Error creating workqueue\n", __func__);
|
||||
return -EFAULT;
|
||||
}
|
||||
|
||||
shm_svc_handle = qmi_handle_create(shm_svc_notify, NULL);
|
||||
if (!shm_svc_handle) {
|
||||
SHM_ERR("%s: Creating shm_svc_handle failed\n", __func__);
|
||||
rc = -ENOMEM;
|
||||
goto start_svc_error1;
|
||||
}
|
||||
|
||||
rc = qmi_svc_register(shm_svc_handle, &shm_svc_ops_options);
|
||||
if (rc < 0) {
|
||||
SHM_ERR("%s: Registering shm svc failed - %d\n", __func__, rc);
|
||||
goto start_svc_error2;
|
||||
}
|
||||
return 0;
|
||||
start_svc_error2:
|
||||
qmi_handle_destroy(shm_svc_handle);
|
||||
start_svc_error1:
|
||||
destroy_workqueue(shm_svc_workqueue);
|
||||
return rc;
|
||||
}
|
||||
|
||||
/**
|
||||
* parse_devicetree() - Parse the device tree for HMA information
|
||||
* @node: Pointer to the device tree node.
|
||||
* @hma: HMA information which needs to be extracted.
|
||||
*
|
||||
* This function parses the device tree, extracts the HMA information.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int parse_devicetree(struct device_node *node,
|
||||
struct hma_info *hma)
|
||||
{
|
||||
char *key;
|
||||
const char *subsys_name;
|
||||
const char *ssrestart_string;
|
||||
|
||||
key = "qcom,subsys-name";
|
||||
subsys_name = of_get_property(node, key, NULL);
|
||||
if (!subsys_name)
|
||||
goto error;
|
||||
strlcpy(hma->subsys_name, subsys_name, SUBSYS_NAME_LEN);
|
||||
|
||||
key = "qcom,ssrestart-string";
|
||||
ssrestart_string = of_get_property(node, key, NULL);
|
||||
if (!ssrestart_string)
|
||||
goto error;
|
||||
strlcpy(hma->ssrestart_string, ssrestart_string, SSRESTART_STRLEN);
|
||||
return 0;
|
||||
error:
|
||||
SHM_ERR("%s: missing key: %s\n", __func__, key);
|
||||
return -ENODEV;
|
||||
}
|
||||
|
||||
/**
|
||||
* system_health_monitor_probe() - Probe function to construct HMA info
|
||||
* @pdev: Platform device pointing to a device tree node.
|
||||
*
|
||||
* This function extracts the HMA information from the device tree, constructs
|
||||
* it and adds it to the global list.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int system_health_monitor_probe(struct platform_device *pdev)
|
||||
{
|
||||
int rc;
|
||||
struct hma_info *hma, *tmp_hma;
|
||||
struct device_node *node;
|
||||
|
||||
mutex_lock(&hma_info_list_lock);
|
||||
for_each_child_of_node(pdev->dev.of_node, node) {
|
||||
hma = kzalloc(sizeof(*hma), GFP_KERNEL);
|
||||
if (!hma) {
|
||||
SHM_ERR("%s: Error allocation hma_info\n", __func__);
|
||||
rc = -ENOMEM;
|
||||
goto probe_err;
|
||||
}
|
||||
|
||||
rc = parse_devicetree(node, hma);
|
||||
if (rc) {
|
||||
SHM_ERR("%s Failed to parse Device Tree\n", __func__);
|
||||
kfree(hma);
|
||||
goto probe_err;
|
||||
}
|
||||
|
||||
init_srcu_struct(&hma->reset_srcu);
|
||||
hma->restart_nb.notifier_call = restart_notifier_cb;
|
||||
hma->restart_nb_h = subsys_notif_register_notifier(
|
||||
hma->ssrestart_string, &hma->restart_nb);
|
||||
if (IS_ERR_OR_NULL(hma->restart_nb_h)) {
|
||||
cleanup_srcu_struct(&hma->reset_srcu);
|
||||
kfree(hma);
|
||||
rc = -EFAULT;
|
||||
SHM_ERR("%s: Error registering restart notif for %s\n",
|
||||
__func__, hma->ssrestart_string);
|
||||
goto probe_err;
|
||||
}
|
||||
|
||||
list_add_tail(&hma->list, &hma_info_list);
|
||||
SHM_INFO_LOG("%s: Added HMA info for %s\n",
|
||||
__func__, hma->subsys_name);
|
||||
}
|
||||
|
||||
rc = start_system_health_monitor_service();
|
||||
if (rc) {
|
||||
SHM_ERR("%s Failed to start service %d\n", __func__, rc);
|
||||
goto probe_err;
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
return 0;
|
||||
probe_err:
|
||||
list_for_each_entry_safe(hma, tmp_hma, &hma_info_list, list) {
|
||||
list_del(&hma->list);
|
||||
subsys_notif_unregister_notifier(hma->restart_nb_h,
|
||||
&hma->restart_nb);
|
||||
cleanup_srcu_struct(&hma->reset_srcu);
|
||||
kfree(hma);
|
||||
}
|
||||
mutex_unlock(&hma_info_list_lock);
|
||||
return rc;
|
||||
}
|
||||
|
||||
static struct of_device_id system_health_monitor_match_table[] = {
|
||||
{ .compatible = "qcom,system-health-monitor" },
|
||||
{},
|
||||
};
|
||||
|
||||
static struct platform_driver system_health_monitor_driver = {
|
||||
.probe = system_health_monitor_probe,
|
||||
.driver = {
|
||||
.name = MODULE_NAME,
|
||||
.owner = THIS_MODULE,
|
||||
.of_match_table = system_health_monitor_match_table,
|
||||
},
|
||||
};
|
||||
|
||||
/**
|
||||
* system_health_monitor_init() - Initialize the system health monitor module
|
||||
*
|
||||
* This functions registers a platform driver to probe for and extract the HMA
|
||||
* information. This function registers the character device interface to the
|
||||
* user-space.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
static int __init system_health_monitor_init(void)
|
||||
{
|
||||
int rc;
|
||||
|
||||
shm_ilctxt = ipc_log_context_create(SHM_ILCTXT_NUM_PAGES, "shm", 0);
|
||||
if (!shm_ilctxt) {
|
||||
SHM_ERR("%s: Unable to create SHM logging context\n", __func__);
|
||||
shm_debug_mask = 0;
|
||||
}
|
||||
|
||||
rc = platform_driver_register(&system_health_monitor_driver);
|
||||
if (rc) {
|
||||
SHM_ERR("%s: system_health_monitor_driver register failed %d\n",
|
||||
__func__, rc);
|
||||
return rc;
|
||||
}
|
||||
|
||||
rc = alloc_chrdev_region(&system_health_monitor_dev,
|
||||
0, 1, "system_health_monitor");
|
||||
if (rc < 0) {
|
||||
SHM_ERR("%s: alloc_chrdev_region() failed %d\n", __func__, rc);
|
||||
return rc;
|
||||
}
|
||||
|
||||
system_health_monitor_classp = class_create(THIS_MODULE,
|
||||
"system_health_monitor");
|
||||
if (IS_ERR_OR_NULL(system_health_monitor_classp)) {
|
||||
SHM_ERR("%s: class_create() failed\n", __func__);
|
||||
rc = -ENOMEM;
|
||||
goto init_error1;
|
||||
}
|
||||
|
||||
cdev_init(&system_health_monitor_cdev, &system_health_monitor_fops);
|
||||
system_health_monitor_cdev.owner = THIS_MODULE;
|
||||
rc = cdev_add(&system_health_monitor_cdev,
|
||||
system_health_monitor_dev , 1);
|
||||
if (rc < 0) {
|
||||
SHM_ERR("%s: cdev_add() failed - rc %d\n",
|
||||
__func__, rc);
|
||||
goto init_error2;
|
||||
}
|
||||
|
||||
system_health_monitor_devp = device_create(system_health_monitor_classp,
|
||||
NULL, system_health_monitor_dev, NULL,
|
||||
"system_health_monitor");
|
||||
if (IS_ERR_OR_NULL(system_health_monitor_devp)) {
|
||||
SHM_ERR("%s: device_create() failed - rc %d\n",
|
||||
__func__, rc);
|
||||
rc = PTR_ERR(system_health_monitor_devp);
|
||||
goto init_error3;
|
||||
}
|
||||
SHM_INFO_LOG("%s: Complete\n", __func__);
|
||||
return 0;
|
||||
init_error3:
|
||||
cdev_del(&system_health_monitor_cdev);
|
||||
init_error2:
|
||||
class_destroy(system_health_monitor_classp);
|
||||
init_error1:
|
||||
unregister_chrdev_region(MAJOR(system_health_monitor_dev), 1);
|
||||
return rc;
|
||||
}
|
||||
|
||||
module_init(system_health_monitor_init);
|
||||
MODULE_DESCRIPTION("System Health Monitor");
|
||||
MODULE_LICENSE("GPL v2");
|
134
drivers/soc/qcom/system_health_monitor_v01.c
Normal file
134
drivers/soc/qcom/system_health_monitor_v01.c
Normal file
|
@ -0,0 +1,134 @@
|
|||
/* Copyright (c) 2014, The Linux Foundation. All rights reserved.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 and
|
||||
* only version 2 as published by the Free Software Foundation.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
*/
|
||||
#include <linux/qmi_encdec.h>
|
||||
|
||||
#include <soc/qcom/msm_qmi_interface.h>
|
||||
|
||||
#include "system_health_monitor_v01.h"
|
||||
|
||||
struct elem_info hmon_register_req_msg_v01_ei[] = {
|
||||
{
|
||||
.data_type = QMI_OPT_FLAG,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(uint8_t),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x10,
|
||||
.offset = offsetof(struct hmon_register_req_msg_v01,
|
||||
name_valid),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_STRING,
|
||||
.elem_len = 255,
|
||||
.elem_size = sizeof(char),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x10,
|
||||
.offset = offsetof(struct hmon_register_req_msg_v01,
|
||||
name),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_OPT_FLAG,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(uint8_t),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x11,
|
||||
.offset = offsetof(struct hmon_register_req_msg_v01,
|
||||
timeout_valid),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_UNSIGNED_4_BYTE,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(uint32_t),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x11,
|
||||
.offset = offsetof(struct hmon_register_req_msg_v01,
|
||||
timeout),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_EOTI,
|
||||
.is_array = NO_ARRAY,
|
||||
.is_array = QMI_COMMON_TLV_TYPE,
|
||||
},
|
||||
};
|
||||
|
||||
struct elem_info hmon_register_resp_msg_v01_ei[] = {
|
||||
{
|
||||
.data_type = QMI_STRUCT,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(struct qmi_response_type_v01),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x02,
|
||||
.offset = offsetof(struct hmon_register_resp_msg_v01,
|
||||
resp),
|
||||
.ei_array = get_qmi_response_type_v01_ei(),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_EOTI,
|
||||
.is_array = NO_ARRAY,
|
||||
.is_array = QMI_COMMON_TLV_TYPE,
|
||||
},
|
||||
};
|
||||
|
||||
struct elem_info hmon_health_check_ind_msg_v01_ei[] = {
|
||||
{
|
||||
.data_type = QMI_EOTI,
|
||||
.is_array = NO_ARRAY,
|
||||
.is_array = QMI_COMMON_TLV_TYPE,
|
||||
},
|
||||
};
|
||||
|
||||
struct elem_info hmon_health_check_complete_req_msg_v01_ei[] = {
|
||||
{
|
||||
.data_type = QMI_OPT_FLAG,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(uint8_t),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x10,
|
||||
.offset = offsetof(
|
||||
struct hmon_health_check_complete_req_msg_v01,
|
||||
result_valid),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_SIGNED_4_BYTE_ENUM,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(enum hmon_check_result_v01),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x10,
|
||||
.offset = offsetof(
|
||||
struct hmon_health_check_complete_req_msg_v01,
|
||||
result),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_EOTI,
|
||||
.is_array = NO_ARRAY,
|
||||
.is_array = QMI_COMMON_TLV_TYPE,
|
||||
},
|
||||
};
|
||||
|
||||
struct elem_info hmon_health_check_complete_resp_msg_v01_ei[] = {
|
||||
{
|
||||
.data_type = QMI_STRUCT,
|
||||
.elem_len = 1,
|
||||
.elem_size = sizeof(struct qmi_response_type_v01),
|
||||
.is_array = NO_ARRAY,
|
||||
.tlv_type = 0x02,
|
||||
.offset = offsetof(
|
||||
struct hmon_health_check_complete_resp_msg_v01,
|
||||
resp),
|
||||
.ei_array = get_qmi_response_type_v01_ei(),
|
||||
},
|
||||
{
|
||||
.data_type = QMI_EOTI,
|
||||
.is_array = NO_ARRAY,
|
||||
.is_array = QMI_COMMON_TLV_TYPE,
|
||||
},
|
||||
};
|
66
drivers/soc/qcom/system_health_monitor_v01.h
Normal file
66
drivers/soc/qcom/system_health_monitor_v01.h
Normal file
|
@ -0,0 +1,66 @@
|
|||
/* Copyright (c) 2014, The Linux Foundation. All rights reserved.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 and
|
||||
* only version 2 as published by the Free Software Foundation.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*
|
||||
*/
|
||||
#ifndef SYSTEM_HEALTH_MONITOR_V01_H
|
||||
#define SYSTEM_HEALTH_MONITOR_V01_H
|
||||
|
||||
#define HMON_SERVICE_ID_V01 0x3C
|
||||
#define HMON_SERVICE_VERS_V01 0x01
|
||||
|
||||
#define QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_REQ_V01 0x0022
|
||||
#define QMI_HEALTH_MON_HEALTH_CHECK_COMPLETE_RESP_V01 0x0022
|
||||
#define QMI_HEALTH_MON_HEALTH_CHECK_IND_V01 0x0021
|
||||
#define QMI_HEALTH_MON_REG_RESP_V01 0x0020
|
||||
#define QMI_HEALTH_MON_REG_REQ_V01 0x0020
|
||||
|
||||
struct hmon_register_req_msg_v01 {
|
||||
uint8_t name_valid;
|
||||
char name[256];
|
||||
uint8_t timeout_valid;
|
||||
uint32_t timeout;
|
||||
};
|
||||
#define HMON_REGISTER_REQ_MSG_V01_MAX_MSG_LEN 265
|
||||
extern struct elem_info hmon_register_req_msg_v01_ei[];
|
||||
|
||||
struct hmon_register_resp_msg_v01 {
|
||||
struct qmi_response_type_v01 resp;
|
||||
};
|
||||
#define HMON_REGISTER_RESP_MSG_V01_MAX_MSG_LEN 7
|
||||
extern struct elem_info hmon_register_resp_msg_v01_ei[];
|
||||
|
||||
struct hmon_health_check_ind_msg_v01 {
|
||||
char placeholder;
|
||||
};
|
||||
#define HMON_HEALTH_CHECK_IND_MSG_V01_MAX_MSG_LEN 0
|
||||
extern struct elem_info hmon_health_check_ind_msg_v01_ei[];
|
||||
|
||||
enum hmon_check_result_v01 {
|
||||
HMON_CHECK_RESULT_MIN_VAL_V01 = INT_MIN,
|
||||
HEALTH_MONITOR_CHECK_SUCCESS_V01 = 0,
|
||||
HEALTH_MONITOR_CHECK_FAILURE_V01 = 1,
|
||||
HMON_CHECK_RESULT_MAX_VAL_V01 = INT_MAX,
|
||||
};
|
||||
|
||||
struct hmon_health_check_complete_req_msg_v01 {
|
||||
uint8_t result_valid;
|
||||
enum hmon_check_result_v01 result;
|
||||
};
|
||||
#define HMON_HEALTH_CHECK_COMPLETE_REQ_MSG_V01_MAX_MSG_LEN 7
|
||||
extern struct elem_info hmon_health_check_complete_req_msg_v01_ei[];
|
||||
|
||||
struct hmon_health_check_complete_resp_msg_v01 {
|
||||
struct qmi_response_type_v01 resp;
|
||||
};
|
||||
#define HMON_HEALTH_CHECK_COMPLETE_RESP_MSG_V01_MAX_MSG_LEN 7
|
||||
extern struct elem_info hmon_health_check_complete_resp_msg_v01_ei[];
|
||||
|
||||
#endif /* SYSTEM_HEALTH_MONITOR_V01_H */
|
34
include/soc/qcom/system_health_monitor.h
Normal file
34
include/soc/qcom/system_health_monitor.h
Normal file
|
@ -0,0 +1,34 @@
|
|||
/* Copyright (c) 2014, The Linux Foundation. All rights reserved.
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License version 2 and
|
||||
* only version 2 as published by the Free Software Foundation.
|
||||
*
|
||||
* This program is distributed in the hope that it will be useful,
|
||||
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
* GNU General Public License for more details.
|
||||
*/
|
||||
|
||||
#ifndef SYSTEM_HEALTH_MONITOR_H
|
||||
#define SYSTEM_HEALTH_MONITOR_H
|
||||
|
||||
#ifdef CONFIG_SYSTEM_HEALTH_MONITOR
|
||||
/**
|
||||
* kern_check_system_health() - Check the system health
|
||||
*
|
||||
* This function is used by the kernel drivers to initiate the
|
||||
* system health check. This function in turn trigger SHM to send
|
||||
* QMI message to all the HMAs connected to it.
|
||||
*
|
||||
* Return: 0 on success, standard Linux error codes on failure.
|
||||
*/
|
||||
int kern_check_system_health(void);
|
||||
#else
|
||||
static inline int kern_check_system_health(void)
|
||||
{
|
||||
return -ENODEV;
|
||||
}
|
||||
#endif /* CONFIG_SYSTEM_HEALTH_MONITOR */
|
||||
|
||||
#endif /* SYSTEM_HEALTH_MONITOR_H */
|
Loading…
Add table
Reference in a new issue